SELECTION OF PROTEINS
USING RNA-PROTEIN FUSIONS

Background of the Invention 
This application is a continuation-in-part of co-pending application, 
Szostak et al., U.S. S.N. 09/007,005, filed January 14, 1998, which claims benefit from
provisional applications, Szostak et al., U.S. S.N. 60/064,491, filed November 6, 1997,
now abandoned, and U.S. S.N. 60/035,963, filed January 21, 1997, now abandoned.
This invention relates to protein selection methods. 
The invention was made with government support under grants
F32 GM17776-01 and F32 GM17776-02. The government has certain rights in the
invention. 

Methods currently exist for the isolation of RNA and DNA molecules
based on their functions. For example, experiments of Ellington and Szostak (Nature
346:818 (1990); and Nature 355:850 (1992)) and Tuerk and Gold (Science 249:505
(1990); and J. Mol. Biol. 222:739 (1991)) have demonstrated that very rare (i.e., less
than 1 in 10¹³) nucleic acid molecules with desired properties may be isolated out of
complex pools of molecules by repeated rounds of selection and amplification. These
methods offer advantages over traditional genetic selections in that (i) very large
candidate pools may be screened (> 10¹⁵), (ii) host viability and in vivo conditions are
not concerns, and (iii) selections may be carried out even if an in vivo genetic screen
does not exist. The power of in vitro selection has been demonstrated in defining
novel RNA and DNA sequences with very specific protein binding functions (see, for
example, Tuerk and Gold, Science 249:505 (1990); Irvine et al., J. Mol. Biol. 222:739
(1991); Oliphant et al., Mol. Cell Biol. 9:2944 (1989); Blackwell et al., Science
250:1104 (1990); Pollock and Treisman, Nuc. Acids Res. 18:6197 (1990); Thiesen
and Bach, Nuc. Acids Res. 18:3203 (1990); Bartel et al., Cell 57:529 (1991); Stormo
and Yoshioka, Proc. Natl. Acad. Sci. USA 88:5699 (1991); and Bock et al., Nature
355:564 (1992)), small molecule binding functions (Ellington and Szostak, Nature
346:818 (1990); Ellington and Szostak, Nature 355:850 (1992)), and catalytic
functions (Green et al., Nature 347:406 (1990); Robertson and Joyce, Nature 344:467
(1990); Beaudry and Joyce, Science 257:635 (1992); Bartel and Szostak, Science
261:1411 (1993); Lorsch and Szostak, Nature 371:31-36 (1994); Cuenoud and
Szostak, Nature 375:611-614 (1995); Chapman and Szostak, Chemistry and Biology
2:325-333 (1995); and Lohse and Szostak, Nature 381:442-444 (1996)). A similar
scheme for the selection and amplification of proteins has not been demonstrated.



Summary of the Invention

The purpose of the present invention is to allow the principles of in vitro 
selection and in vitro evolution to be applied to proteins. The invention facilitates the 
isolation of proteins with desired properties from large pools of partially or 
completely random amino acid sequences. In addition, the invention solves the
problem of recovering and amplifying the protein sequence information by covalently
attaching the mRNA coding sequence to the protein molecule. 

In general, the inventive method consists of an in vitro or in situ
transcription/translation protocol that generates protein covalently linked to the 3' end
of its own mRNA, i.e., an RNA-protein fusion. This is accomplished by synthesis
and in vitro or in situ translation of an mRNA molecule with a peptide acceptor
attached to its 3' end. One preferred peptide acceptor is puromycin, a nucleoside
analog that adds to the C-terminus of a growing peptide chain and terminates
translation. In one preferred design, a DNA sequence is included between the end of
the message and the peptide acceptor which is designed to cause the ribosome to
pause at the end of the open reading frame, providing additional time for the peptide
acceptor (for example, puromycin) to accept the nascent peptide chain before
hydrolysis of the peptidyl-tRNA linkage.

WO 00/47775
PCT/US00/02589

If desired, the resulting RNA-protein fusion allows repeated rounds of
selection and amplification because the protein sequence information may be
recovered by reverse transcription and amplification (for example, by PCR
amplification as well as any other amplification technique, including RNA-based
amplification techniques such as 3SR or TSA). The amplified nucleic acid may then
be transcribed, modified, and in vitro or in situ translated to generate mRNA-protein
fusions for the next round of selection. The ability to carry out multiple rounds of
selection and amplification enables the enrichment and isolation of very rare
molecules, e.g., one desired molecule out of a pool of 10¹⁵ members. This in turn
allows the isolation of new or improved proteins which specifically recognize
virtually any target or which catalyze desired chemical reactions.

Accordingly, in a first aspect, the invention features a method for selection 
of a desired protein, involving the steps of: (a) providing a population of candidate 
RNA molecules, each of which includes a translation initiation sequence and a start 
codon operably linked to a candidate protein coding sequence and each of which is
operably linked to a peptide acceptor at the 3' end of the candidate protein coding
sequence; (b) in vitro or in situ translating the candidate protein coding sequences to
produce a population of candidate RNA-protein fusions; and (c) selecting a desired
RNA-protein fusion, thereby selecting the desired protein. 

In a related aspect, the invention features a method for selection of a DNA
molecule which encodes a desired protein, involving the steps of: (a) providing a
population of candidate RNA molecules, each of which includes a translation 
initiation sequence and a start codon operably linked to a candidate protein coding 
sequence and each of which is operably linked to a peptide acceptor at the 3' end of 
the candidate protein coding sequence; (b) in vitro or in situ translating the candidate
protein coding sequences to produce a population of candidate RNA-protein fusions;
(c) selecting a desired RNA-protein fusion; and (d) generating from the RNA portion 
of the fusion a DNA molecule which encodes the desired protein. 

In another related aspect, the invention features a method for selection of a
protein having an altered function relative to a reference protein, involving the steps
of: (a) producing a population of candidate RNA molecules from a population of 
DNA templates, the candidate DNA templates each having a candidate protein coding 
sequence which differs from the reference protein coding sequence, the RNA 
molecules each comprising a translation initiation sequence and a start codon operably
linked to the candidate protein coding sequence and each being operably linked to a 
peptide acceptor at the 3' end; (b) in vitro or in situ translating the candidate protein 
coding sequences to produce a population of candidate RNA-protein fusions; and (c) 
selecting an RNA-protein fusion having an altered function, thereby selecting the
protein having the altered function.

In yet another related aspect, the invention features a method for selection 
of a DNA molecule which encodes a protein having an altered function relative to a 
reference protein, involving the steps of: (a) producing a population of candidate RNA 
molecules from a population of candidate DNA templates, the candidate DNA
templates each having a candidate protein coding sequence which differs from the
reference protein coding sequence, the RNA molecules each comprising a translation
initiation sequence and a start codon operably linked to the candidate protein coding 
sequence and each being operably linked to a peptide acceptor at the 3' end; (b) in 
vitro or in situ translating the candidate protein coding sequences to produce a
population of RNA-protein fusions; (c) selecting an RNA-protein fusion having an
altered function; and (d) generating from the RNA portion of the fusion a DNA 
molecule which encodes the protein having the altered function. 

In yet another related aspect, the invention features a method for selection 
of a desired RNA, involving the steps of: (a) providing a population of candidate
RNA molecules, each of which includes a translation initiation sequence and a start
codon operably linked to a candidate protein coding sequence and each of which is 
operably linked to a peptide acceptor at the 3' end of the candidate protein coding 
sequence; (b) in vitro or in situ translating the candidate protein coding sequences to
produce a population of candidate RNA-protein fusions; and (c) selecting a desired
RNA-protein fusion, thereby selecting the desired RNA.

In preferred embodiments of the above methods, the peptide acceptor is
puromycin; each of the candidate RNA molecules further includes a pause sequence
or further includes a DNA or DNA analog sequence covalently bonded to the 3' end of
the RNA; the population of candidate RNA molecules includes at least 10⁹,
preferably, at least 10¹⁰, more preferably, at least 10¹¹, 10¹², or 10¹³, and, most
preferably, at least 10¹⁴ different RNA molecules; the in vitro translation reaction is
carried out in a lysate prepared from a eukaryotic cell or portion thereof (and is, for
example, carried out in a reticulocyte lysate or wheat germ lysate); the in vitro
translation reaction is carried out in an extract prepared from a prokaryotic cell (for
example, E. coli) or portion thereof; the selection step involves binding of the desired
protein to an immobilized binding partner; the selection step involves assaying for a
functional activity of the desired protein; the DNA molecule is amplified; the method
further involves repeating the steps of the above selection methods; the method
further involves transcribing an RNA molecule from the DNA molecule and repeating
steps (a) through (d); following the in vitro translating step, the method further
involves an incubation step carried out in the presence of 50-100 mM Mg²⁺; and the
RNA-protein fusion further includes a nucleic acid or nucleic acid analog sequence
positioned proximal to the peptide acceptor which increases flexibility.

In other related aspects, the invention features an RNA-protein fusion
selected by any of the methods of the invention; a ribonucleic acid covalently bonded
through an amide bond to an amino acid sequence, the amino acid sequence being
encoded by the ribonucleic acid; and a ribonucleic acid which includes a translation
initiation sequence and a start codon operably linked to a candidate protein coding
sequence, the ribonucleic acid being operably linked to a peptide acceptor (for
example, puromycin) at the 3' end of the candidate protein coding sequence.

In a second aspect, the invention features a method for selection of a 
desired protein or desired RNA through enrichment of a sequence pool. This method 
involves the steps of: (a) providing a population of candidate RNA molecules, each
of which includes a translation initiation sequence and a start codon operably linked
to a candidate protein coding sequence and each of which is operably linked to a 
peptide acceptor at the 3' end of the candidate protein coding sequence; (b) in vitro or 
in situ translating the candidate protein coding sequences to produce a population of 
candidate RNA-protein fusions; (c) contacting the population of RNA-protein fusions
with a binding partner specific for either the RNA portion or the protein portion of the 
RNA-protein fusion under conditions which substantially separate the binding 
partner-RNA-protein fusion complexes from unbound members of the population; (d) 
releasing the bound RNA-protein fusions from the complexes; and (e) contacting the 
population of RNA-protein fusions from step (d) with a binding partner specific for
the protein portion of the desired RNA-protein fusion under conditions which 
substantially separate the binding partner-RNA-protein fusion complex from unbound 
members of said population, thereby selecting the desired protein and the desired 
RNA. 

In preferred embodiments, the method further involves repeating steps (a)
through (e). In addition, for these repeated steps, the same or different binding
partners may be used, in any order, for selective enrichment of the desired RNA-protein
fusion. In another preferred embodiment, step (d) involves the use of a
binding partner (for example, a monoclonal antibody) specific for the protein portion
of the desired fusion. This step is preferably carried out following reverse
transcription of the RNA portion of the fusion to generate a DNA which encodes the
desired protein. If desired, this DNA may be isolated and/or PCR amplified. This
enrichment technique may be used to select a desired protein or may be used to select
a protein having an altered function relative to a reference protein.

In other preferred embodiments of the enrichment methods, the peptide
acceptor is puromycin; each of the candidate RNA molecules further includes a pause
sequence or further includes a DNA or DNA analog sequence covalently bonded to
the 3' end of the RNA; the population of candidate RNA molecules includes at least
10⁹, preferably, at least 10¹⁰, more preferably, at least 10¹¹, 10¹², or 10¹³, and, most
preferably, at least 10¹⁴ different RNA molecules; the in vitro translation reaction is
carried out in a lysate prepared from a eukaryotic cell or portion thereof (and is, for
example, carried out in a reticulocyte lysate or wheat germ lysate); the in vitro
translation reaction is carried out in an extract prepared from a prokaryotic cell or
portion thereof (for example, E. coli); the DNA molecule is amplified; at least one of
the binding partners is immobilized on a solid support; following the in vitro
translating step, the method further involves an incubation step carried out in the
presence of 50-100 mM Mg²⁺; and the RNA-protein fusion further includes a nucleic
acid or nucleic acid analog sequence positioned proximal to the peptide acceptor
which increases flexibility.

In a related aspect, the invention features methods for producing libraries 
(for example, protein, DNA, or RNA-fusion libraries) or methods for selecting desired 
molecules (for example, protein, DNA, or RNA molecules or molecules having a 
particular function or altered function) which involve a step of post-translational
incubation in the presence of high salt (including, without limitation, high salt which
includes a monovalent cation, such as K⁺, NH₄⁺, or Na⁺, a divalent cation, such as
Mg²⁺, or a combination thereof). This incubation may be carried out at approximately
room temperature or approximately -20 °C, at preferred salt concentrations of
between approximately 125 mM and 1.5 M (more preferably, between approximately
300 mM and 600 mM) for monovalent cations and between approximately 25 mM and
200 mM for divalent cations.

In another related aspect, the invention features kits for carrying out any of 
the selection methods described herein. 

In a third and final aspect, the invention features a microchip that includes
an array of immobilized single-stranded nucleic acids, the nucleic acids being
hybridized to RNA-protein fusions. Preferably, the protein component of the RNA-protein
fusion is encoded by the RNA.

As used herein, by a "population" is meant more than one molecule (for
example, more than one RNA, DNA, or RNA-protein fusion molecule). Because the
methods of the invention facilitate selections which begin, if desired, with large
numbers of candidate molecules, a "population" according to the invention preferably
means more than 10⁹ molecules, more preferably, more than 10¹¹, 10¹², or 10¹³
molecules, and, most preferably, more than 10¹³ molecules.

By "selecting" is meant substantially partitioning a molecule from other
molecules in a population. As used herein, a "selecting" step provides at least a 2-fold,
preferably, a 30-fold, more preferably, a 100-fold, and, most preferably, a 1000-fold
enrichment of a desired molecule relative to undesired molecules in a population
following the selection step. As indicated herein, a selection step may be repeated
any number of times, and different types of selection steps may be combined in a
given approach.
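
As a rough illustration of how the per-round enrichment factors named above compound over repeated rounds, the following sketch counts the idealized number of selection rounds needed before a single desired molecule outnumbers the rest of its pool. It assumes a constant fold enrichment in every round and no loss of the desired molecule, neither of which is stated in the text; the function name `rounds_to_majority` and the pool size of 10^15 are illustrative choices only.

```python
from fractions import Fraction


def rounds_to_majority(pool_size: int, fold: int) -> int:
    """Idealized rounds until one desired molecule outnumbers all others.

    Hypothetical model: every round multiplies the ratio of desired to
    undesired molecules by exactly `fold`.
    """
    if pool_size < 2 or fold < 2:
        raise ValueError("pool_size and fold must both be at least 2")
    # Start with one desired molecule among (pool_size - 1) undesired ones.
    ratio = Fraction(1, pool_size - 1)
    rounds = 0
    while ratio < 1:
        ratio *= fold  # exact rational arithmetic, no rounding error
        rounds += 1
    return rounds


if __name__ == "__main__":
    for fold in (2, 30, 100, 1000):
        print(fold, rounds_to_majority(10**15, fold))
```

Under these simplifying assumptions, a 1000-fold step would need 5 rounds and a 2-fold step would need 50 rounds to bring one molecule in 10^15 to a majority; real selections would be expected to deviate from this idealized count.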

By a "protein" is meant any two or more naturally occurring or modified 
amino acids joined by one or more peptide bonds. "Protein" and "peptide" are used 
interchangeably herein. 

By "RNA" is meant a sequence of two or more covalently bonded,
naturally occurring or modified ribonucleotides. One example of a modified RNA
included within this term is phosphorothioate RNA. 

By a "translation initiation sequence" is meant any sequence which is 
capable of providing a functional ribosome entry site. In bacterial systems, this 
region is sometimes referred to as a Shine-Dalgarno sequence.

By a "start codon" is meant three bases which signal the beginning of a 
protein coding sequence. Generally, these bases are AUG (or ATG); however, any 
other base triplet capable of being utilized in this manner may be substituted. 
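
The start codon definition above can be illustrated with a minimal sketch that locates the first AUG (or ATG) in a sequence string. The function name `find_start_codon` and the example sequences are hypothetical and are not taken from any construct described here; the sketch ignores reading frame and does not cover the alternative triplets that the definition also permits.

```python
from typing import Optional

START_CODONS = ("AUG", "ATG")  # RNA and DNA spellings named in the definition


def find_start_codon(sequence: str) -> Optional[int]:
    """Return the 0-based index of the first AUG/ATG, or None if absent."""
    seq = sequence.upper()
    # str.find returns -1 when the codon is not present; drop those results.
    hits = [i for i in (seq.find(codon) for codon in START_CODONS) if i != -1]
    return min(hits) if hits else None


if __name__ == "__main__":
    # Made-up RNA string used only to exercise the function.
    print(find_start_codon("GGAGGACGAAAUGAAA"))
```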

By "covalently bonded" to a peptide acceptor is meant that the peptide 
acceptor is joined to a "protein coding sequence" either directly through a covalent
bond or indirectly through another covalently bonded sequence (for example, DNA 
corresponding to a pause site). 

By a "peptide acceptor" is meant any molecule capable of being added to
the C-terminus of a growing protein chain by the catalytic activity of the ribosomal
peptidyl transferase function. Typically, such molecules contain (i) a nucleotide or
nucleotide-like moiety (for example, adenosine or an adenosine analog
(di-methylation at the N-6 amino position is acceptable)), (ii) an amino acid or amino
acid-like moiety (for example, any of the 20 D- or L-amino acids or any amino acid
analog thereof (for example, O-methyl tyrosine or any of the analogs described by
Ellman et al., Meth. Enzymol. 202:301, 1991)), and (iii) a linkage between the two (for
example, an ester, amide, or ketone linkage at the 3' position or, less preferably, the 2'
position); preferably, this linkage does not significantly perturb the pucker of the ring
from the natural ribonucleotide conformation. Peptide acceptors may also possess a
nucleophile, which may be, without limitation, an amino group, a hydroxyl group, or
a sulfhydryl group. In addition, peptide acceptors may be composed of nucleotide
mimetics, amino acid mimetics, or mimetics of the combined nucleotide-amino acid
structure.

By a peptide acceptor being positioned "at the 3' end" of a protein coding
sequence is meant that the peptide acceptor molecule is positioned after the final
codon of that protein coding sequence. This term includes, without limitation, a 
peptide acceptor molecule that is positioned precisely at the 3' end of the protein 
coding sequence as well as one which is separated from the final codon by intervening 
coding or non-coding sequence (for example, a sequence corresponding to a pause 
site). This term also includes constructs in which coding or non-coding sequences
follow (that is, are 3' to) the peptide acceptor molecule. In addition, this term
encompasses, without limitation, a peptide acceptor molecule that is covalently 
bonded (either directly or indirectly through intervening nucleic acid sequence) to the 
protein coding sequence, as well as one that is joined to the protein coding sequence 
by some non-covalent means, for example, through hybridization using a second
nucleic acid sequence that binds at or near the 3' end of the protein coding sequence
and that itself is bound to a peptide acceptor molecule. 

By an "altered function" is meant any qualitative or quantitative change in 
the function of a molecule.

By a "pause sequence" is meant a nucleic acid sequence which causes a
ribosome to slow or stop its rate of translation. 

By "binding partner," as used herein, is meant any molecule which has a 
specific, covalent or non-covalent affinity for a portion of a desired RNA-protein 
fusion. Examples of binding partners include, without limitation, members of 
antigen/antibody pairs, protein/inhibitor pairs, receptor/ligand pairs (for example, cell
surface receptor/ligand pairs, such as hormone receptor/peptide hormone pairs), 
enzyme/substrate pairs (for example, kinase/substrate pairs), lectin/carbohydrate pairs, 
oligomeric or heterooligomeric protein aggregates, DNA binding protein/DNA 
binding site pairs, RNA/protein pairs, and nucleic acid duplexes, heteroduplexes, or 
ligated strands, as well as any molecule which is capable of forming one or more 
covalent or non-covalent bonds (for example, disulfide bonds) with any portion of an 
RNA-protein fusion. Binding partners include, without limitation, any of the 
"selection motifs" presented in Figure 2. 

By a "solid support" is meant, without limitation, any column (or column 
material), bead, test tube, microtiter dish, solid particle (for example, agarose or 
sepharose), microchip (for example, silicon, silicon-glass, or gold chip), or membrane 
(for example, the membrane of a liposome or vesicle) to which an affinity complex 
may be bound, either directly or indirectly (for example, through other binding partner 
intermediates such as other antibodies or Protein A), or in which an affinity complex 
may be embedded (for example, through a receptor or channel). 

By "high salt" is meant having a concentration of a monovalent cation of 
at least 200 mM, and, preferably, at least 500 mM or even 1 M, and/or a concentration 
of a divalent or higher valence cation of at least 25 mM, preferably, at least 50 mM, 
and, most preferably, at least 100 mM. 
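
The minimum thresholds in the "high salt" definition can be expressed as a simple check. This is a sketch only: the function name `is_high_salt` and its parameter names are hypothetical, concentrations are assumed to be given in mM, and only the minimum thresholds (200 mM monovalent, 25 mM divalent or higher valence) are encoded, not the preferred values.

```python
def is_high_salt(monovalent_mM: float = 0.0, divalent_mM: float = 0.0) -> bool:
    """True if either cation class meets the minimum 'high salt' threshold.

    The definition joins the two criteria with "and/or", so satisfying
    either one is treated as sufficient here.
    """
    MONOVALENT_MIN_MM = 200.0  # at least 200 mM monovalent cation
    DIVALENT_MIN_MM = 25.0     # at least 25 mM divalent or higher valence cation
    return monovalent_mM >= MONOVALENT_MIN_MM or divalent_mM >= DIVALENT_MIN_MM


if __name__ == "__main__":
    print(is_high_salt(monovalent_mM=500.0))
    print(is_high_salt(monovalent_mM=100.0, divalent_mM=10.0))
```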

The presently claimed invention provides a number of significant 
advantages. To begin with, it is the first example of this type of scheme for the 
selection and amplification of proteins. This technique overcomes the impasse
created by the need to recover nucleotide sequences corresponding to desired, isolated
proteins (since only nucleic acids can be replicated). In particular, many prior
methods that allowed the isolation of proteins from partially or fully randomized
pools did so through an in vivo step. Methods of this sort include monoclonal
antibody technology (Milstein, Sci. Amer. 243:66 (1980); and Schultz et al., J. Chem.
Engng. News 68:26 (1990)), phage display (Smith, Science 228:1315 (1985); Parmley
and Smith, Gene 73:305 (1988); and McCafferty et al., Nature 348:552 (1990)),
peptide-lac repressor fusions (Cull et al., Proc. Natl. Acad. Sci. USA 89:1865 (1992)),
and classical genetic selections. Unlike the present technique, each of these methods
relies on a topological link between the protein and the nucleic acid so that the
information of the protein is retained and can be recovered in readable, nucleic acid
form.

In addition, the present invention provides advantages over the stalled
translation method (Tuerk and Gold, Science 249:505 (1990); Irvine et al., J. Mol.
Biol. 222:739 (1991); Korman et al., Proc. Natl. Acad. Sci. USA 79:1844-1848
(1982); Mattheakis et al., Proc. Natl. Acad. Sci. USA 91:9022-9026 (1994);
Mattheakis et al., Meth. Enzymol. 267:195 (1996); and Hanes and Pluckthun, Proc.
Natl. Acad. Sci. USA 94:4937 (1997)), a technique in which selection is for some
property of a nascent protein chain that is still complexed with the ribosome and its
mRNA. Unlike the stalled translation technique, the present method does not rely on
maintaining the integrity of an mRNA:ribosome:nascent chain ternary complex, a
complex that is very fragile and is therefore limiting with respect to the types of
selections which are technically feasible.

The present method also provides advantages over the branched synthesis
approach proposed by Brenner and Lerner (Proc. Natl. Acad. Sci. USA 89:5381-5383
(1992)), in which DNA-peptide fusions are generated, and genetic information is
theoretically recovered following one round of selection. Unlike the branched
synthesis approach, the present method does not require the regeneration of a peptide
from the DNA portion of a fusion (which, in the branched synthesis approach, is
generally accomplished by individual rounds of chemical synthesis). Accordingly,
the present method allows for repeated rounds of selection using populations of
candidate molecules. In addition, unlike the branched synthesis technique, which is
generally limited to the selection of fairly short sequences, the present method is
applicable to the selection of protein molecules of considerable length.

In yet another advantage, the present selection and directed evolution
technique can make use of very large and complex libraries of candidate sequences.
In contrast, existing protein selection methods which rely on an in vivo step are
typically limited to relatively small libraries of somewhat limited complexity. This
advantage is particularly important when selecting functional protein sequences
considering, for example, that 10¹³ possible sequences exist for a peptide of only 10
amino acids in length. In classical genetic techniques, lac repressor fusion
approaches, and phage display methods, maximum complexities generally fall orders
of magnitude below 10¹³ members. Large library size also provides an advantage for
directed evolution applications, in that sequence space can be explored to a greater
depth around any given starting sequence.
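
The figure of roughly 10^13 possible sequences for a 10-residue peptide follows from 20 amino acid choices at each of 10 positions. The sketch below reproduces that arithmetic and gives an upper bound on how much of that space a library of a given size could sample; it assumes the 20 standard amino acids and, optimistically, that every library member is distinct. The function names are illustrative.

```python
AMINO_ACIDS = 20  # assumption: the 20 standard genetically encoded amino acids


def sequence_space(length: int) -> int:
    """Number of distinct peptides of the given length."""
    return AMINO_ACIDS ** length


def max_coverage(library_size: int, length: int) -> float:
    """Upper bound on the fraction of sequence space a library can sample.

    Assumes every library member is a different sequence, which real
    libraries will not achieve exactly.
    """
    return min(1.0, library_size / sequence_space(length))


if __name__ == "__main__":
    print(sequence_space(10))          # 20**10, about 1.02e13
    print(max_coverage(10**9, 10))     # a library far smaller than the space
    print(max_coverage(10**15, 10))    # a library larger than the space
```

On this bound, a library of 10^9 members could sample at most about one sequence in ten thousand of the 10-residue space, whereas a library of 10^15 members is large enough in principle to cover it.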

The present technique also differs from prior approaches in that the 
selection step is context-independent. In many other selection schemes, the context in 
which, for example, an expressed protein is present can profoundly influence the
nature of the library generated. For example, an expressed protein may not be
properly expressed in a particular system or may not be properly displayed (for 
example, on the surface of a phage particle). Alternatively, the expression of a protein 
may actually interfere with one or more critical steps in a selection cycle, e.g., phage 
viability or infectivity, or lac repressor binding. These problems can result in the loss
of functional molecules or in limitations on the nature of the selection procedures that
may be applied. 

Finally, the present method is advantageous because it provides control 
over the repertoire of proteins that may be tested. In certain techniques (for example, 
antibody selection), there exists little or no control over the nature of the starting pool.
In yet other techniques (for example, lac fusions and phage display), the candidate
pool must be expressed in the context of a fusion protein. In contrast, RNA-protein 
fusion constructs provide control over the nature of the candidate pools available for 
screening. In addition, the candidate pool size has the potential to be as high as RNA 
or DNA pools (~10¹⁵ members), limited only by the size of the in vitro translation
reaction performed. And the makeup of the candidate pool depends completely on 
experimental design; random regions may be screened in isolation or within the 
context of a desired fusion protein, and most if not all possible sequences may be 
expressed in candidate pools of RNA-protein fusions.

Other features and advantages of the invention will be apparent from the
following detailed description, and from the claims.



Detailed Description 
The drawings will first briefly be described. 



Brief Description of the Drawings

FIGURES 1A-1C are schematic representations of steps involved in the
production of RNA-protein fusions. Figure 1A illustrates a sample DNA construct for
generation of an RNA portion of a fusion. Figure 1B illustrates the generation of an
RNA/puromycin conjugate. And Figure 1C illustrates the generation of an RNA-protein
fusion.

FIGURE 2 is a schematic representation of a generalized selection 
protocol according to the invention. 

FIGURE 3 is a schematic representation of a synthesis protocol for
minimal translation templates containing 3' puromycin. Step (A) shows the addition
of protective groups to the reactive functional groups on puromycin (5'-OH and NH₂);
as modified, these groups are suitably protected for use in phosphoramidite based
oligonucleotide synthesis. The protected puromycin was attached to aminohexyl
controlled pore glass (CPG) through the 2'OH group using the standard protocol for
attachment of DNA through its 3'OH (Gait, Oligonucleotide Synthesis, A Practical
Approach, The Practical Approach Series (IRL Press, Oxford, 1984)). In step (B), a
minimal translation template (termed "43-P"), which contained 43 nucleotides, was
synthesized using standard RNA and DNA chemistry (Millipore, Bedford, MA),
deprotected using NH₄OH and TBAF, and gel purified. The template contained 13
bases of RNA at the 5' end followed by 29 bases of DNA attached to the 3' puromycin
at its 5' OH. The RNA sequence contained (i) a Shine-Dalgarno consensus sequence
complementary to five bases of 16S rRNA (Stormo et al., Nucleic Acids Research
10:2971-2996 (1982); Shine and Dalgarno, Proc. Natl. Acad. Sci. USA 71:1342-1346
(1974); and Steitz and Jakes, Proc. Natl. Acad. Sci. USA 72:4734-4738 (1975)), (ii) a
five base spacer, and (iii) a single AUG start codon. The DNA sequence was
dA₂₇dCdCP, where "P" is puromycin.
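
The stated composition of the 43-P template can be checked with simple bookkeeping. The sketch below assumes that the 3' puromycin is counted as one unit toward the template name, which is an inference that reconciles 13 RNA bases plus 29 DNA bases with a 43-unit template; the same count gives 30 units for the DNA-puromycin portion that the Figure 6 description terms 30-P. All variable and function names are illustrative.

```python
RNA_BASES = 13                 # 5' RNA: Shine-Dalgarno (5) + spacer (5) + AUG (3)
DNA_LINKER = "A" * 27 + "CC"   # dA27dCdC, written 5' to 3'
PUROMYCIN_UNITS = 1            # assumption: the 3' puromycin ("P") counts as one unit


def template_length(rna_bases: int, dna_linker: str, puromycin_units: int = 1) -> int:
    """Total unit count of an RNA-DNA-puromycin template."""
    return rna_bases + len(dna_linker) + puromycin_units


if __name__ == "__main__":
    print(len(DNA_LINKER))                                          # DNA bases
    print(template_length(RNA_BASES, DNA_LINKER, PUROMYCIN_UNITS))  # full template
    print(template_length(0, DNA_LINKER, PUROMYCIN_UNITS))          # DNA-puromycin part
```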

FIGURE 4 is a schematic representation of a preferred method for the 
preparation of protected CPG-linked puromycin. 

FIGURE 5 is a schematic representation showing possible modes of
methionine incorporation into a template of the invention. As shown in reaction (A),
the template binds the ribosome, allowing formation of the 70S initiation complex.
Fmet tRNA binds to the P site and is base paired to the template. The puromycin at
the 3' end of the template enters the A site in an intramolecular fashion and forms an
amide linkage to N-formyl methionine via the peptidyl transferase center, thereby
deacylating the tRNA. Phenol/chloroform extraction of the reaction yields the
template with methionine covalently attached. Shown in reaction (B) is an undesired
intermolecular reaction of the template with puromycin containing oligonucleotides.
As before, the minimal template stimulates formation of the 70S ribosome containing
fmet tRNA bound to the P site. This is followed by entry of a second template in
trans to give a covalently attached methionine.

FIGURES 6A-6H are photographs showing the incorporation of ³⁵S
methionine (³⁵S met) into translation templates. Figure 6A demonstrates magnesium
(Mg²⁺) dependence of the reaction. Figure 6B demonstrates base stability of the
product; the change in mobility shown in this figure corresponds to a loss of the 5'
RNA sequence of 43-P (also termed "Met template") to produce the DNA-puromycin
portion, termed 30-P. The retention of the label following base treatment was
consistent with the formation of a peptide bond between ³⁵S methionine and the 3'
puromycin of the template. Figure 6C demonstrates the inhibition of product
formation in the presence of peptidyl transferase inhibitors. Figure 6D demonstrates
the dependence of ³⁵S methionine incorporation on a template coding sequence.
Figure 6E demonstrates DNA template length dependence of ³⁵S methionine
incorporation. Figure 6F illustrates cis versus trans product formation using templates
43-P and 25-P. Figure 6G illustrates cis versus trans product formation using
templates 43-P and 13-P. Figure 6H illustrates cis versus trans product formation
using templates 43-P and 30-P in a reticulocyte lysate system.

FIGURES 7A-7C are schematic illustrations of constructs for testing
peptide fusion formation and selection. Figure 7A shows LP77 ("ligated-product,"
"77" nucleotides long) (also termed "short myc template") (SEQ ID NO: 1). This
sequence contains the c-myc monoclonal antibody epitope tag EQKLISEEDL (SEQ
ID NO: 2) (Evan et al., Mol. Cell Biol. 5:3610-3616 (1985)) flanked by a 5' start
codon and a 3' linker. The 5' region contains a bacterial Shine-Dalgarno sequence
identical to that of 43-P. The coding sequence was optimized for translation in
bacterial systems. In particular, the 5' UTRs of 43-P and LP77 contained a
Shine-Dalgarno sequence complementary to five bases of 16S rRNA (Steitz and
Jakes, Proc. Natl. Acad. Sci. USA 72:4734-4738 (1975)) and spaced similarly to
ribosomal protein sequences (Stormo et al., Nucleic Acids Res. 10:2971-2996 (1982)).
Figure 7B shows LP154 (ligated product, 154 nucleotides long) (also termed "long
myc template") (SEQ ID NO: 3). This sequence contains the code for generation of
the peptide used to isolate the c-myc antibody. The 5' end contains a truncated
version of the TMV upstream sequence (designated "TE"). This 5' UTR contained a
22 nucleotide sequence derived from the TMV 5' UTR encompassing two
ACAAAUUAC direct repeats (Gallie et al., Nucl. Acids Res. 16:883 (1988)). Figure
7C shows Pool #1 (SEQ ID NO: 4), an exemplary sequence to be used for peptide
selection. The final seven amino acids from the original myc peptide were included in
the template to serve as the 3' constant region required for PCR amplification of the
template. This sequence is known not to be part of the antibody binding epitope.

FIGURE 8 is a photograph demonstrating the synthesis of RNA-protein
fusions using templates 43-P, LP77, and LP154, and reticulocyte ("Retic") and wheat
germ ("Wheat") translation systems. The left half of the figure illustrates 35S
methionine incorporation in each of the three templates. The right half of the figure
illustrates the resulting products after RNase A treatment of each of the three
templates to remove the RNA coding region; shown are 35S methionine-labeled
DNA-protein fusions. The DNA portion of each was identical to the oligo 30-P. Thus,
differences in mobility were proportional to the length of the coding regions,
consistent with the existence of proteins of different length in each case.

FIGURE 9 is a photograph demonstrating protease sensitivity of an
RNA-protein fusion synthesized from LP154 and analyzed by denaturing polyacrylamide
gel electrophoresis. Lane 1 contains 32P-labeled 30-P. Lanes 2-4, 5-7, and 8-10
contain the 35S-labeled translation templates recovered from reticulocyte lysate
reactions either without treatment, with RNase A treatment, or with RNase A and
proteinase K treatment, respectively.

FIGURE 10 is a photograph showing the results of immunoprecipitation
reactions using in vitro translated 33 amino acid myc-epitope protein. Lanes 1 and 2
show the translation products of the myc epitope protein and β-globin templates,
respectively. Lanes 3-5 show the results of immunoprecipitation of the myc-epitope
peptide using a c-myc monoclonal antibody and PBS, DB, and PBSTDS wash
buffers, respectively. Lanes 6-8 show the same immunoprecipitation reactions, but
using the β-globin translation product.

FIGURE 11 is a photograph demonstrating immunoprecipitation of an
RNA-protein fusion from an in vitro translation reaction. The picomoles of template
used in the reaction are indicated. Lanes 1-4 show RNA124 (the RNA portion of
fusion LP154), and lanes 5-7 show RNA-protein fusion LP154. After
immunoprecipitation using a c-myc monoclonal antibody and protein G sepharose,
the samples were treated with RNase A and T4 polynucleotide kinase, then loaded on
a denaturing urea polyacrylamide gel to visualize the fusion. In lanes 1-4, with
samples containing either no template or only the RNA portion of the long myc
template (RNA124), no fusion was seen. In lanes 5-7, bands corresponding to the
fusion were clearly visualized. The position of 32P-labeled 30-P is indicated, and the
amount of input template is indicated at the top of the figure.

FIGURE 12 is a graph showing a quantitation of fusion material obtained
from an in vitro translation reaction. The intensity of the fusion bands shown in lanes
5-7 of Figure 11 and the 30-P band (isolated in a parallel fashion on dT25, not shown)
were quantitated on phosphorimager plates and plotted as a function of input LP154
concentration. Recovered modified 30-P (left y axis) was linearly proportional to
input template (x axis), whereas linker-peptide fusion (right y axis) was constant.
From this analysis, it was calculated that ~10^12 fusions were formed per ml of
translation reaction sample.

FIGURE 13 is a schematic representation of thiopropyl sepharose and dT25
agarose, and the ability of these substrates to interact with the RNA-protein fusions of
the invention.

FIGURE 14 is a photograph showing the results of sequential isolation of
fusions of the invention. Lane 1 contains 32P-labeled 30-P. Lanes 2 and 3 show
LP154 isolated from translation reactions and treated with RNase A. In lane 2, LP154
was isolated sequentially, using thiopropyl sepharose followed by dT25 agarose. Lane
3 shows isolation using only dT25 agarose. The results indicated that the product
contained a free thiol, likely the penultimate cysteine in the myc epitope coding
sequence.

FIGURES 15A and 15B are photographs showing the formation of fusion
products using β-globin templates as assayed by SDS-tricine-PAGE (polyacrylamide
gel electrophoresis). Figure 15A shows incorporation of 35S using either no template
(lane 1), a syn-β-globin template (lanes 2-4), or an LP-β-globin template (lanes 5-7).
Figure 15B (lanes labeled as in Fig. 15A) shows 35S-labeled material isolated by
oligonucleotide affinity chromatography. No material was isolated in the absence of a
30-P tail (lanes 2-4).

FIGURES 16A-16C are diagrams and photographs illustrating enrichment
of myc dsDNA versus pool dsDNA by in vitro selection. Figure 16A is a schematic
of the selection protocol. Four mixtures of the myc and pool templates were
translated in vitro and isolated on dT25 agarose followed by TP sepharose to purify the
template fusions from unmodified templates. The mRNA-peptide fusions were then
reverse transcribed to suppress any secondary or tertiary structure present in the
templates. Aliquots of each mixture were removed both before (Figure 16B) and after
(Figure 16C) affinity selection, amplified by PCR in the presence of a labeled primer,
and digested with a restriction enzyme that cleaved only the myc DNA. The input
mixtures of templates were pure myc (lane 1), or a 1:20, 1:200, or 1:2000 myc:pool
(lanes 2-4). The unselected material deviated from the input ratios due to preferential
translation and reverse transcription of the myc template. The enrichment of the myc
template during the selective step was calculated from the change in the pool:myc
ratio before and after selection.
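
The enrichment calculation described for Figure 16 can be sketched as follows. This is a minimal illustration, assuming that enrichment is expressed as the fold decrease in the pool:myc ratio across the selective step; the function name and the example ratios are hypothetical and are not values taken from the figure.

```python
def enrichment_factor(pool_to_myc_before: float, pool_to_myc_after: float) -> float:
    """Fold enrichment of myc template across one selective step.

    Both arguments are pool:myc ratios (pool template divided by myc
    template), measured before and after affinity selection. A larger
    return value means the myc template was enriched more strongly.
    """
    if pool_to_myc_before <= 0 or pool_to_myc_after <= 0:
        raise ValueError("ratios must be positive")
    return pool_to_myc_before / pool_to_myc_after


if __name__ == "__main__":
    # Hypothetical numbers: pool:myc goes from 200:1 to 2:1.
    print(enrichment_factor(200.0, 2.0))
```

Because the unselected material already deviates from the input ratios, the "before" ratio in such a calculation should be the one measured on the unselected aliquot rather than the nominal input ratio.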

FIGURE 17 is a photograph illustrating the translation of myc RNA
templates. The following linkers were used: lanes 1-4, dA27dCdCP; lanes 5-8,
dA27rCrCP; and lanes 9-12, dA21C9C9C9dAdCdCP. In each lane, the concentration of
RNA template was 600 nM, and 35S-Met was used for labeling. Reaction conditions
were as follows: lanes 1, 5, and 9, 30°C for 1 hour; lanes 2, 6, and 10, 30°C for 2
hours; lanes 3, 7, and 11, 30°C for 1 hour, -20°C for 16 hours; and lanes 4, 8, and 12,
30°C for 1 hour, -20°C for 16 hours with 50 mM Mg2+. In this Figure, "A" represents
free peptide, and "B" represents mRNA-peptide fusion.

FIGURE 18 is a photograph illustrating the translation of myc RNA
templates labeled with 32P. The linker utilized was dA21C9C9C9dAdCdCP.
Translation was performed at 30°C for 90 minutes, and incubations were carried out
at -20°C for 2 days without additional Mg2+. The concentrations of mRNA templates
were 400 nM (lane 3), 200 nM (lane 4), 100 nM (lane 5), and 100 nM (lane 6). Lane
1 shows mRNA-peptide fusion labeled with 35S-Met. Lane 2 shows mRNA labeled
with 32P. In lane 6, the reaction was carried out in the presence of 0.5 mM cap analog.

FIGURE 19 is a photograph illustrating the translation of myc RNA
template using lysate obtained from Ambion (lane 1), Novagen (lane 2), and
Amersham (lane 3). The linker utilized was dA27dCdCP. The concentration of the
template was 600 nM, and 35S-Met was used for labeling. Translations were
performed at 30°C for 1 hour, and incubations were carried out at -20°C overnight in
the presence of 50 mM Mg2+.

FIGURE 20 is a graph illustrating enrichment of RNA-peptide fusions 
bound by anti-myc monoclonal antibody 9E10 during six rounds of in vitro selection. 

FIGURE 21 is a graph showing competition assays with synthetic myc
peptides.

FIGURE 22 is a schematic representation illustrating the amino acid
sequences of 12 selected peptides from a random 27-mer library.

FIGURE 23 is a photograph illustrating the effect of linker length on
fusion formation. In this figure, Myc templates containing linkers [N] = 13, 19, 25,
30, 35, 40, 45, or 50 nucleotides long (dA10-47dCdCP) were assayed for fusion
formation by SDS-PAGE. The flexible linker F (dA21[C9]3dAdCdCP) is also shown.
Translations were performed with 600 nM template at 30°C for 90 minutes, followed
by addition of 50 mM Mg2+ and incubation at -20°C for two days.
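
The relationship between the stated linker length and the poly(dA) tract can be illustrated with a short sketch. It assumes, consistent with the 30-P oligonucleotide having the sequence dA27dCdCP, that the quoted length counts the dA residues, the two dC residues, and the terminal puromycin; the function name is hypothetical.

```python
def linker_formula(total_length: int) -> str:
    """Return the dA(n)dCdCP formula for a linker of the given total length.

    Assumption: total length = n (dA residues) + 2 (dC residues)
    + 1 (puromycin, written "P"), as for 30-P = dA27dCdCP.
    """
    n_dA = total_length - 3
    if n_dA < 1:
        raise ValueError("linker too short to contain a dA tract")
    return f"dA{n_dA}dCdCP"


if __name__ == "__main__":
    # Linker lengths listed for Figure 23.
    for length in (13, 19, 25, 30, 35, 40, 45, 50):
        print(length, linker_formula(length))
```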

FIGURE 24 is a photograph illustrating co-translation of myc and λPPase
mRNA. In this figure, 200 nM of λPPase RNA (RNA716) and/or 50 nM myc RNA
(RNA152) containing the flexible linker F (dA21[C9]3dAdCdCP) were translated with
[35S]-Met. Mg2+ (75 mM) was added, followed by incubation at -20°C. No bands
were observed from cross-products (myc templates fused to λPPase protein).

Described herein is a general method for the selection of proteins with
desired functions using fusions in which these proteins are covalently linked to their
own messenger RNAs. These RNA-protein fusions are synthesized by in vitro or in
situ translation of mRNA pools containing a peptide acceptor attached to their 3' ends
(Figure 1B). In one preferred embodiment, after readthrough of the open reading
frame of the message, the ribosome pauses when it reaches the designed pause site,
and the acceptor moiety occupies the ribosomal A site and accepts the nascent peptide
chain from the peptidyl-tRNA in the P site to generate the RNA-protein fusion
(Figure 1C). The covalent link between the protein and the RNA (in the form of an
amide bond between the 3' end of the mRNA and the C-terminus of the protein which
it encodes) allows the genetic information in the protein to be recovered and amplified
(e.g., by PCR) following selection by reverse transcription of the RNA. Once the
fusion is generated, selection or enrichment is carried out based on the properties of
the mRNA-protein fusion, or, alternatively, reverse transcription may be carried out
using the mRNA template while it is attached to the protein to avoid any effect of the
single-stranded RNA on the selection. When the mRNA-protein construct is used,
selected fusions may be tested to determine which moiety (the protein, the RNA, or
both) provides the desired function.

In one preferred embodiment, puromycin (which resembles tyrosyl
adenosine) acts as the acceptor to attach the growing peptide to its mRNA.
Puromycin is an antibiotic that acts by terminating peptide elongation. As a mimetic
of aminoacyl-tRNA, it acts as a universal inhibitor of protein synthesis by binding the
A site, accepting the growing peptide chain, and falling off the ribosome (at a Kd =
10^-4 M) (Traut and Monro, J. Mol. Biol. 10:63 (1964); Smith et al., J. Mol. Biol.
13:617 (1965)). One of the most attractive features of puromycin is the fact that it
forms a stable amide bond to the growing peptide chain, thus allowing for more stable
fusions than potential acceptors that form unstable ester linkages. In particular, the
peptidyl-puromycin molecule contains a stable amide linkage between the peptide and
the O-methyl tyrosine portion of the puromycin. The O-methyl tyrosine is in turn
linked by a stable amide bond to the 3'-amino group of the modified adenosine

Other possible choices for acceptors include tRNA-like structures at the 3'
end of the mRNA, as well as other compounds that act in a manner similar to
puromycin. Such compounds include, without limitation, any compound which
possesses an amino acid linked to an adenine or an adenine-like compound, such as
the amino acid nucleotides, phenylalanyl-adenosine (A-Phe), tyrosyl adenosine
(A-Tyr), and alanyl adenosine (A-Ala), as well as amide-linked structures, such as
phenylalanyl 3' deoxy 3' amino adenosine, alanyl 3' deoxy 3' amino adenosine, and
tyrosyl 3' deoxy 3' amino adenosine; in any of these compounds, any of the
naturally-occurring L-amino acids or their analogs may be utilized. In addition, a combined
tRNA-like 3' structure-puromycin conjugate may also be used in the invention.

Shown in Figure 2 is a preferred selection scheme according to the 
invention. The steps involved in this selection are generally carried out as follows. 

Step 1. Preparation of the DNA template. As a step toward generating
the RNA-protein fusions of the invention, the RNA portion of the fusion is
synthesized. This may be accomplished by direct chemical RNA synthesis or, more
commonly, is accomplished by transcribing an appropriate double-stranded DNA
template.

Such DNA templates may be created by any standard technique (including
any technique of recombinant DNA technology, chemical synthesis, or both). In
principle, any method that allows production of one or more templates containing a
known, random, randomized, or mutagenized sequence may be used for this purpose.
In one particular approach, an oligonucleotide (for example, containing random bases)
is synthesized and is amplified (for example, by PCR) prior to transcription.
Chemical synthesis may also be used to produce a random cassette which is then
inserted into the middle of a known protein coding sequence (see, for example,
chapter 8.2, Ausubel et al., Current Protocols in Molecular Biology, John Wiley &
Sons and Greene Publishing Company, 1994). This latter approach produces a high
density of mutations around a specific site of interest in the protein.

An alternative to total randomization of a DNA template sequence is
partial randomization, and a pool synthesized in this way is generally referred to as a
"doped" pool. An example of this technique, performed on an RNA sequence, is
described, for example, by Ekland et al. (Nucl. Acids Research 23:3231 (1995)).
Partial randomization may be performed chemically by biasing the synthesis reactions
such that each base addition reaction mixture contains an excess of one base and small
amounts of each of the others; by careful control of the base concentrations, a desired
mutation frequency may be achieved by this approach. Partially randomized pools
may also be generated using error prone PCR techniques, for example, as described in
Beaudry and Joyce (Science 257:635 (1992)) and Bartel and Szostak (Science
261:1411 (1993)).
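
The relationship between base concentrations and mutation frequency in a doped pool can be sketched numerically. The following is a minimal illustration, assuming that all four bases are incorporated with equal efficiency (in practice, coupling efficiencies differ and the mixture would need to be adjusted accordingly) and that positions mutate independently; the function names and the example doping level are hypothetical.

```python
from math import comb


def doped_mixture(wild_type_base: str, mutation_frequency: float,
                  bases: str = "ACGT") -> dict:
    """Base fractions for one doped position.

    The wild-type base is present in excess (1 - mutation_frequency) and
    the remaining bases share the mutation frequency equally.
    Assumes equal incorporation efficiency for all bases.
    """
    other = mutation_frequency / (len(bases) - 1)
    return {b: (1.0 - mutation_frequency) if b == wild_type_base else other
            for b in bases}


def mutation_count_distribution(length: int, mutation_frequency: float,
                                max_mutations: int) -> list:
    """Binomial probability of k = 0..max_mutations mutations per molecule."""
    p = mutation_frequency
    return [comb(length, k) * p**k * (1.0 - p)**(length - k)
            for k in range(max_mutations + 1)]


if __name__ == "__main__":
    # Hypothetical example: 30% doping at a position whose wild-type base is A.
    print(doped_mixture("A", 0.30))
    # Expected number of mutations in a 100-nucleotide doped region.
    print(100 * 0.30)
    print(mutation_count_distribution(100, 0.30, 5))
```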

Numerous methods are also available for generating a DNA construct
beginning with a known sequence and then creating a mutagenized DNA pool.
Examples of such techniques are described in Ausubel et al. (supra, chapter 8);
Sambrook et al. (Molecular Cloning: A Laboratory Manual, chapter 15, Cold Spring
Harbor Press, New York, 2nd ed. (1989)); Cadwell et al. (PCR Methods and
Applications 2:28 (1992)); Tsang et al. (Meth. Enzymol. 267:410 (1996)); Reidhaar-Olsen
et al. (Meth. Enzymol. 208:564 (1991)); and Ekland and Bartel (Nucl. Acids.
Res. 23:3231 (1995)). Random sequences may also be generated by the "shuffling"
technique outlined in Stemmer (Nature 370:389 (1994)). Finally, a set of two or
more homologous genes can be recombined in vitro to generate a starting library
(Crameri et al., Nature 391:288-291 (1998)).

ORFs may be constructed from random sequences in a variety of ways
depending on the codons chosen. Stop codons in the open reading frame are
preferably avoided. Totally random sequence libraries may be used (NNN coding)
but contain a proportion of stop codons (3/64 = 4.7% per codon) that may be
unacceptably high for all but the shortest libraries. Such libraries also contain rarely
used codons that can sometimes result in poor translation. NNG/C codons provide a
slightly reduced stop frequency (1/32 = 3.1% per codon) while providing access to the
best codons for all 20 amino acids for mammalian translation systems. NNG/C
codons are less optimal when applied in bacterial translation systems where the best
codons end in A or T in 7 cases (AEGKRTV). Several solutions exist that provide for
very low stop codon frequency (~1.0%), with amino acid content similar to globular
proteins using three different nucleotide mixtures, N1N2N3 codons (LaBean and
Kauffman, Protein Science 2:1249-1254 (1993)) (and references therein). Finally, an
almost infinite variety of semi-rational design strategies may be employed to pattern
libraries according to amino acid type. For example, hydrophobic (h) or polar (p)
amino acids can be chosen using NTN or NAN codons respectively (Beasley and
Hecht, J. Biol. Chem. 272:2031-2034 (1997)). These can be patterned to give
preference to α-helix (phpphhpp...) or β-sheet (phphph...) formation.
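
The stop codon frequencies quoted above, and the amino acids reachable with NTN and NAN codons, can be checked by enumerating the standard genetic code. The sketch below is illustrative only: it assumes equimolar base mixtures at each randomized position and uses the DNA alphabet (T rather than U); the helper names and the 27-codon example length are hypothetical choices.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(first: str = BASES, second: str = BASES, third: str = BASES) -> list:
    """All codons allowed by the given base sets at positions 1, 2, and 3."""
    return ["".join(c) for c in product(first, second, third)]


def stop_fraction(codon_list: list) -> float:
    """Fraction of codons in the list that are stop codons ('*')."""
    return sum(CODON_TABLE[c] == "*" for c in codon_list) / len(codon_list)


def stop_free_probability(per_codon_stop: float, n_codons: int) -> float:
    """Probability that an ORF of n_codons random codons has no stop."""
    return (1.0 - per_codon_stop) ** n_codons


if __name__ == "__main__":
    nnn = codons()               # fully random
    nns = codons(third="GC")     # NNG/C
    print("NNN stop fraction:", stop_fraction(nnn))
    print("NNG/C stop fraction:", stop_fraction(nns))
    # Hypothetical library length of 27 random codons.
    print("NNN, 27 codons stop-free:", stop_free_probability(stop_fraction(nnn), 27))
    print("NNG/C, 27 codons stop-free:", stop_free_probability(stop_fraction(nns), 27))
    # Residues encoded by NTN (hydrophobic) and NAN (polar) codons.
    print("NTN:", sorted({CODON_TABLE[c] for c in codons(second="T")}))
    print("NAN:", sorted({CODON_TABLE[c] for c in codons(second="A")}))
```

Note that an unrestricted NAN codon set still includes the TAA and TAG stop codons, so a patterned library would typically restrict the flanking positions as well.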

ORFs constructed from synthetic sequences may also contain stop codons
resulting from insertions or deletions in the synthetic DNA. These defects may have
negative consequences due to alterations of the translation reading frame.
Examination of a number of pools and synthetic genes constructed from synthetic
oligonucleotides indicates that insertions and deletions occur with a frequency of
~0.6% per position, or 1.8% per codon. The precise frequency of these occurrences is
variable, and is thought to depend on the source and length of the synthetic DNA. In
particular, longer sequences show a higher frequency of insertions and deletions (Haas
et al., Current Biology 6:315-324 (1996)). A simple solution to reducing frame shifts
within the ORF is to work with relatively short segments of synthetic DNA (80
nucleotides or less) that can be purified to homogeneity. Longer ORFs can then be
generated by restriction and ligation of several shorter sequences.
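
The effect of the per-position insertion/deletion frequency on longer synthetic segments can be estimated with a simple model. This sketch assumes that defects occur independently at each position at the ~0.6% frequency quoted above and treats any single insertion or deletion as disrupting the molecule; it ignores compensating events and the length dependence noted in the text, and the function names are hypothetical.

```python
def per_codon_indel_rate(rate_per_position: float = 0.006) -> float:
    """Probability that at least one of the three positions in a codon
    carries an insertion or deletion."""
    return 1.0 - (1.0 - rate_per_position) ** 3


def defect_free_fraction(n_nucleotides: int,
                         rate_per_position: float = 0.006) -> float:
    """Fraction of molecules of the given length with no insertion or
    deletion at any position (independent-position assumption)."""
    return (1.0 - rate_per_position) ** n_nucleotides


if __name__ == "__main__":
    print("per-codon rate:", per_codon_indel_rate())
    # Compare a short segment with progressively longer ones.
    for length in (80, 160, 320):
        print(length, "nt defect-free fraction:", defect_free_fraction(length))
```

Under this model the defect-free fraction falls geometrically with length, which is consistent with the recommendation to assemble long ORFs from short, purified segments.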

To optimize a selection scheme of the invention, the sequences and
structures at the 5' and 3' ends of a template may also be altered. Preferably, this is
carried out in two separate selections, each involving the insertion of random domains
into the template proximal to the appropriate end, followed by selection. These
selections may serve (i) to maximize the amount of fusion made (and thus to
maximize the complexity of a library) or (ii) to provide optimized translation
sequences. Further, the method may be generally applicable, combined with
mutagenic PCR, to the optimization of translation templates both in the coding and
non-coding regions.

Step 2. Generation of RNA. As noted above, the RNA portion of an
RNA-protein fusion may be chemically synthesized using standard techniques of
oligonucleotide synthesis. Alternatively, and particularly if longer RNA sequences
are utilized, the RNA portion is generated by in vitro transcription of a DNA template.
In one preferred approach, T7 polymerase is used to enzymatically generate the RNA
strand. Transcription is generally performed in the same volume as the PCR reaction
(PCR DNA derived from a 100 µl reaction is used for 100 µl of transcription). This
RNA can be generated with a 5' cap if desired using a large molar excess of m7GpppG
to GTP in the transcription reaction (Gray and Hentze, EMBO J. 13:3882-3891
(1994)). Other appropriate RNA polymerases for this use include, without limitation,
the SP6, T3, and E. coli RNA polymerases (described, for example, in Ausubel et al.
(supra, chapter 3)). In addition, the synthesized RNA may be, in whole or in part,
modified RNA. In one particular example, phosphorothioate RNA may be produced
(for example, by T7 transcription) using modified ribonucleotides and standard
techniques. Such modified RNA provides the advantage of being nuclease stable.
Full length RNA samples are then purified from transcription reactions as previously
described using urea PAGE followed by desalting on NAP-25 (Pharmacia) (Roberts
and Szostak, Proc. Natl. Acad. Sci. USA 94:12297-12302 (1997)).

Step 3. Ligation of Puromycin to the Template. Next, puromycin (or any
other appropriate peptide acceptor) is covalently bonded to the template sequence.
This step may be accomplished using T4 RNA ligase to attach the puromycin directly
to the RNA sequence, or preferably the puromycin may be attached by way of a DNA
"splint" using T4 DNA ligase or any other enzyme which is capable of joining
together two nucleotide sequences (see Figure 1B) (see also, for example, Ausubel et
al., supra, chapter 3, sections 14 and 15). tRNA synthetases may also be used to
attach puromycin-like compounds to RNA. For example, phenylalanyl tRNA
synthetase links phenylalanine to phenylalanyl-tRNA molecules containing a 3' amino
group, generating RNA molecules with puromycin-like 3' ends (Fraser and Rich,
Proc. Natl. Acad. Sci. USA 70:2671 (1973)). Other peptide acceptors which may be
used include, without limitation, any compound which possesses an amino acid linked
to an adenine or an adenine-like compound, such as the amino acid nucleotides,
phenylalanyl-adenosine (A-Phe), tyrosyl adenosine (A-Tyr), and alanyl adenosine
(A-Ala), as well as amide-linked structures, such as phenylalanyl 3' deoxy 3' amino
adenosine, alanyl 3' deoxy 3' amino adenosine, and tyrosyl 3' deoxy 3' amino
adenosine; in any of these compounds, any of the naturally-occurring L-amino acids
or their analogs may be utilized. A number of peptide acceptors are described, for
example, in Krayevsky and Kukhanova, Progress in Nucleic Acids Research and
Molecular Biology 23:1 (1979).

Step 4. Generation and Recovery of RNA-Protein Fusions. To generate
RNA-protein fusions, any in vitro or in situ translation system may be utilized. As
shown below, eukaryotic systems are preferred, and two particularly preferred
systems include the wheat germ and reticulocyte lysate systems. In principle,
however, any translation system which allows formation of an RNA-protein fusion
and which does not significantly degrade the RNA portion of the fusion is useful in
the invention. In addition, to reduce RNA degradation in any of these systems,
degradation-blocking antisense oligonucleotides may be included in the translation
reaction mixture; such oligonucleotides specifically hybridize to and cover sequences
within the RNA portion of the molecule that trigger degradation (see, for example,
Hanes and Pluckthun, Proc. Natl. Acad. Sci. USA 94:4937 (1997)).

As noted above, any number of eukaryotic translation systems are
available for use in the invention. These include, without limitation, lysates from
yeast, ascites, tumor cells (Leibowitz et al., Meth. Enzymol. 194:536 (1991)), and
Xenopus oocyte eggs. Useful in vitro translation systems from bacterial systems
include, without limitation, those described in Zubay (Ann. Rev. Genet. 7:267
(1973)); Chen and Zubay (Meth. Enzymol. 101:44 (1983)); and Ellman (Meth.
Enzymol. 202:301 (1991)).

In addition, translation reactions may be carried out in situ. In one
particular example, translation may be carried out by injecting mRNA into Xenopus
eggs using standard techniques.

Once generated, RNA-protein fusions may be recovered from the
translation reaction mixture by any standard technique of protein or RNA purification.
Typically, protein purification techniques are utilized. As shown below, for example,
purification of a fusion may be facilitated by the use of suitable chromatographic
reagents such as dT25 agarose or thiopropyl sepharose. Purification, however, may
also or alternatively involve purification based upon the RNA portion of the fusion;
techniques for such purification are described, for example, in Ausubel et al. (supra,
chapter 4).

Step 5. Selection of the Desired RNA-Protein Fusion. Selection of a
desired RNA-protein fusion may be accomplished by any means available to
selectively partition or isolate a desired fusion from a population of candidate fusions.
Examples of isolation techniques include, without limitation, selective binding, for
example, to a binding partner which is directly or indirectly immobilized on a column,
bead, membrane, or other solid support, and immunoprecipitation using an antibody
specific for the protein moiety of the fusion. The first of these techniques makes use
of an immobilized selection motif which can consist of any type of molecule to which
binding is possible. A list of possible selection motif molecules is presented in Figure
2. Selection may also be based upon the use of substrate molecules attached to an
affinity label (for example, substrate-biotin) which react with a candidate molecule, or
upon any other type of interaction with a fusion molecule. In addition, proteins may
be selected based upon their catalytic activity in a manner analogous to that described
by Bartel and Szostak for the isolation of RNA enzymes (supra); according to that
particular technique, desired molecules are selected based upon their ability to link a
target molecule to themselves, and the functional molecules are then isolated based
upon the presence of that target. Selection schemes for isolating novel or improved
catalytic proteins using this same approach or any other functional selection are
enabled by the present invention.

In addition, as described herein, selection of a desired RNA-protein fusion
(or its DNA copy) may be facilitated by enrichment for that fusion in a pool of
candidate molecules. To carry out such an optional enrichment, a population of
candidate RNA-protein fusions is contacted with a binding partner (for example, one
of the binding partners described above) which is specific for either the RNA portion
or the protein portion of the fusion, under conditions which substantially separate the
binding partner-fusion complex from unbound members in the sample. This step may
be repeated, and the technique preferably includes at least two sequential enrichment
steps, one in which the fusions are selected using a binding partner specific for the
RNA portion and another in which the fusions are selected using a binding partner
specific for the protein portion. In addition, if enrichment steps targeting the same
portion of the fusion (for example, the protein portion) are repeated, different binding
partners are preferably utilized. In one particular example described herein, a
population of molecules is enriched for desired fusions by first using a binding partner
specific for the RNA portion of the fusion and then, in two sequential steps, using two
different binding partners, both of which are specific for the protein portion of the
fusion. Again, these complexes may be separated from sample components by any
standard separation technique including, without limitation, column affinity
chromatography, centrifugation, or immunoprecipitation.

Moreover, elution of an RNA-protein fusion from an enrichment (or
selection) complex may be accomplished by a number of approaches. For example,
as described herein, one may utilize a denaturing or non-specific chemical elution step
to isolate a desired RNA-protein fusion. Such a step facilitates the release of complex
components from each other or from an associated solid support in a relatively
non-specific manner by breaking non-covalent bonds between the components and/or
between the components and the solid support. As described herein, one exemplary
denaturing or non-specific chemical elution reagent is 4% HOAc/H2O. Other
exemplary denaturing or non-specific chemical elution reagents include guanidine,
urea, high salt, detergent, or any other means by which non-covalent adducts may
generally be removed. Alternatively, one may utilize a specific chemical elution
approach, in which a chemical is exploited that causes the specific release of a fusion
molecule. In one particular example, if the linker arm of a desired fusion protein
contains one or more disulfide bonds, bound fusion aptamers may be eluted by the
addition, for example, of DTT, resulting in the reduction of the disulfide bond and
release of the bound target.

Alternatively, elution may be accomplished by specifically disrupting
affinity complexes; such techniques selectively release complex components by the
addition of an excess of one member of the complex. For example, in an
ATP-binding selection, elution is performed by the addition of excess ATP to the
incubation mixture. Finally, one may carry out a step of enzymatic elution. By this
approach, a bound molecule itself or an exogenously added protease (or other
appropriate hydrolytic enzyme) cleaves and releases either the target or the enzyme.
In one particular example, a protease target site may be included in either of the
complex components, and the bound molecules eluted by addition of the protease.
Alternately, in a catalytic selection, elution may be used as a selection step for
isolating molecules capable of releasing (for example, cleaving) themselves from a
solid support.

Step 6. Generation of a DNA Copy of the RNA Sequence using Reverse 
Transcriptase. If desired, a DNA copy of a selected RNA fusion sequence is readily 
available by reverse transcribing that RNA sequence using any standard technique 
(for example, using Superscript reverse transcriptase). This step may be carried out 
prior to the selection or enrichment step (for example, as described in Figure 16), or 
following that step. Alternatively, the reverse transcription process may be carried 
out prior to the isolation of the fusion from the in vitro or in situ translation mixture. 

Next, the DNA template is amplified, either as a partial or full-length 
double-stranded sequence. Preferably, in this step, full-length DNA templates are 
generated, using appropriate oligonucleotides and PCR amplification. 

These steps, and the reagents and techniques for carrying out these steps, 
are now described in detail using particular examples. These examples are provided 
for the purpose of illustrating the invention, and should not be construed as limiting. 



GENERATION OF TEMPLATES FOR RNA-PROTEIN FUSIONS 

As shown in Figures 1A and 2, the selection scheme of the present 
invention preferably makes use of double-stranded DNA templates which include a 
number of design elements. The first of these elements is a promoter to be used in 
conjunction with a desired RNA polymerase for mRNA synthesis. As shown in 
Figure 1A and described herein, the T7 promoter is preferred, although any promoter 
capable of directing synthesis from a linear double-stranded DNA may be used. 

The second element of the template shown in Figure 1A is termed the 5' 
untranslated region (or 5'UTR) and corresponds to the RNA upstream of the 
translation start site. Shown in Figure 1A is a preferred 5'UTR (termed "TE") which 
is a deletion mutant of the Tobacco Mosaic Virus 5' untranslated region and, in 
particular, corresponds to the bases directly 5' of the TMV translation start; the 
sequence of this UTR is as follows: rGrGrG rArCrA rArUrU rArCrU rArUrU rUrArC 
rArArU rUrArC rA (with the first 3 G nucleotides being inserted to augment 
transcription) (SEQ ID NO: 5). Any other appropriate 5' UTR may be utilized (see, 
for example, Kozak, Microbiol. Rev. 47:1 (1983); and Jobling et al., Nature 325:622 
(1987)). 

The third element shown in Figure 1A is the translation start site. In 
general, this is an AUG codon. However, there are examples where codons other than 
AUG are utilized in naturally-occurring coding sequences, and these codons may also 
be used in the selection scheme of the invention. The precise sequence context 
surrounding this codon influences the efficiency of translation (Kozak, 
Microbiological Reviews 47:1-45 (1983); and Kozak, J. Biol. Chem. 
266:19867-19870 (1991)). The sequence 5'RNNAUGR provides a good start context for 
most sequences, with a preference for A as the first purine (-3), and G as the 
second (+4) (Kozak, Microbiological Reviews 47:1-45 (1983); and Kozak, J. Mol. 
Biol. 196:947-950 (1987)). 
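
The start-context rule quoted above can be checked mechanically. The following Python sketch is illustrative only and is not part of the described method; the function name `start_contexts` and the example sequence are hypothetical. It scans an mRNA for AUG codons lying in a 5'RNNAUGR context (R = purine, N = any base) and flags the preferred case of A at -3 and G at +4.

```python
import re

# 5'-RNNAUGR-3': R is A or G, N is any base. A lookahead is used so that
# overlapping candidate sites are all reported.
KOZAK = re.compile(r"(?=([AG])[ACGU]{2}AUG([AG]))")


def start_contexts(mrna):
    """Return a list of (aug_index, base_at_minus3, base_at_plus4, preferred)
    for every AUG in `mrna` that sits in an RNNAUGR context.
    `aug_index` is the 0-based position of the A of the AUG."""
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA spelling
    hits = []
    for m in KOZAK.finditer(seq):
        minus3, plus4 = m.group(1), m.group(2)
        preferred = minus3 == "A" and plus4 == "G"
        hits.append((m.start() + 3, minus3, plus4, preferred))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the specification.
    print(start_contexts("GGGACAAUGGCC"))
```

A site that is reported with `preferred` set to False still satisfies RNNAUGR; it simply lacks the A/G preference noted in the text.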

The fourth element in Figure 1A is the open reading frame of the protein 
(termed ORF), which encodes the protein sequence. This open reading frame may 
encode any naturally-occurring, random, randomized, mutagenized, or totally 
synthetic protein sequence. The most important feature of the ORF and adjacent 3' 
constant region is that neither contains stop codons. The presence of stop codons 
would allow premature termination of the protein synthesis, preventing fusion 
formation. 
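
Because a single in-frame stop codon prevents fusion formation, a candidate ORF plus 3' constant region can be screened before use. The sketch below is a minimal illustration, assuming the standard stop codons (UAA, UAG, UGA) and a reading frame that begins at the first base of the supplied string; `first_stop` is a hypothetical helper name and the example sequences are invented.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code (assumption)


def first_stop(orf_and_constant, frame_start=0):
    """Return the 0-based index of the first in-frame stop codon,
    or None if the sequence can be read through to its end."""
    seq = orf_and_constant.upper().replace("T", "U")
    for i in range(frame_start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


if __name__ == "__main__":
    print(first_stop("AUGGCUUAAGGC"))  # in-frame UAA present
    print(first_stop("AUGGCUAAGGGC"))  # UAA occurs only out of frame
```

Only the translated frame is examined, so a stop triplet that appears out of frame is not counted against the template.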

The fifth element shown in Figure 1A is the 3' constant region. This 
sequence facilitates PCR amplification of the pool sequences and ligation of the 
puromycin-containing oligonucleotide to the mRNA. If desired, this region may also 
include a pause site, a sequence which causes the ribosome to pause and thereby 
allows additional time for an acceptor moiety (for example, puromycin) to accept a 
nascent peptide chain from the peptidyl-tRNA; this pause site is discussed in more 
detail below. 

To develop the present methodology, RNA-protein fusions were initially 
generated using highly simplified mRNA templates containing 1-2 codons. This 
approach was taken for two reasons. First, templates of this size could readily be 
made by chemical synthesis. And, second, a small open reading frame allowed 
critical features of the reaction, including efficiency of linkage, end heterogeneity, 
template dependence, and accuracy of translation, to be readily assayed. 

Design of Construct. A basic construct was used for generating test 
RNA-protein fusions. The molecule consisted of an mRNA containing (i) a 
Shine-Dalgarno (SD) sequence for translation initiation which contained a 3 base 
deletion of the SD sequence from ribosomal protein L1 and which was complementary 
to 5 bases of 16S rRNA (i.e., rGrGrA rGrGrA rCrGrA rA) (SEQ ID NO: 6) (Stormo et 
al., Nucleic Acids Research 10:2971-2996 (1982); Shine and Dalgarno, Proc. Natl. 
Acad. Sci. USA 71:1342-1346 (1974); and Steitz and Jakes, Proc. Natl. Acad. Sci. 
USA 72:4734-4738 (1975)), (ii) an AUG start codon, (iii) a DNA linker to act as a 
pause site (i.e., 5'-(dA)27), (iv) dCdC-3', and (v) a 3' puromycin (P). The poly dA 
sequence was chosen because it was known to template tRNA poorly in the A site 
(Morgan et al., J. Mol. Biol. 26:477-497 (1967); Ricker and Kaji, Nucleic Acids 
Research 19:6573-6578 (1991)) and was designed to act as a good pause site. The 
length of the oligo dA linker was chosen to span the ~60-70 Å distance between the 
decoding site and the peptidyl transfer center of the ribosome. The dCdCP mimicked 
the CCA end of a tRNA and was designed to facilitate binding of the puromycin to 
the A site of the ribosome. 
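
As a bookkeeping aid, the five parts of the construct listed above can be assembled and counted. The Python sketch below is illustrative; the variable names and the `build_43p` helper are hypothetical. It uses only the elements given in the text and assumes, as an inference from the arithmetic rather than a statement in the text, that the "43" in 43-P counts the 3' puromycin as one unit.

```python
# Elements in 5' -> 3' order, as listed in the text.
SD = "GGAGGACGAA"    # rGrGrA rGrGrA rCrGrA rA (SEQ ID NO: 6), RNA
START = "AUG"        # RNA start codon
LINKER = "A" * 27    # (dA)27 DNA linker / pause site
CC = "CC"            # dCdC, mimicking the CCA end of a tRNA


def build_43p():
    """Return (parts, nucleotide_count, count_including_puromycin)."""
    parts = [("SD", SD), ("AUG", START), ("dA27", LINKER), ("dCdC", CC)]
    nucleotides = sum(len(seq) for _, seq in parts)
    # Assumption: the construct name counts the 3' puromycin as one unit.
    return parts, nucleotides, nucleotides + 1


if __name__ == "__main__":
    parts, n_nt, n_total = build_43p()
    for name, seq in parts:
        print(f"{name:5s} {len(seq):2d}  {seq}")
    print("nucleotides:", n_nt, "| with puromycin:", n_total)
```

Under the same counting assumption, the linker portion alone ((dA)27, dCdC, and puromycin) comes to 30 units.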

Chemical Synthesis of Minimal Template 43-P. To synthesize construct 
43-P (shown in Figure 3), puromycin was first attached to a solid support in such a 
way that it would be compatible with standard phosphoramidite oligonucleotide 
synthesis chemistry. The synthesis protocol for this oligo is outlined schematically in 
Figure 3 and is described in more detail below. To attach puromycin to a controlled 
pore glass (CPG) solid support, the amino group was protected with a trifluoroacetyl 
group as described in Applied Biosystems User Bulletin #49 for DNA synthesizer 
model 380 (1988). Next, protection of the 5' OH was carried out using a standard 
DMT-Cl approach (Gait, Oligonucleotide Synthesis, A Practical Approach, The 
Practical Approach Series (IRL Press, Oxford, 1984)), and attachment to aminohexyl 
CPG through the 2' OH was effected in exactly the same fashion as the 3' OH would be 
used for attachment of a deoxynucleoside (see Fig. 3 and Gait, supra, p. 47). The 5' 
DMT-CPG-linked protected puromycin was then suitable for chain extension with 
phosphoramidite monomers. The synthesis of the oligo proceeded in the 3' -> 5' 
direction in the order: (i) 3' puromycin, (ii) pdCpdC, (iii) ~27 units of dA as a linker, 
(iv) AUG, and (v) the Shine-Dalgarno sequence. The sequence of the 43-P construct 
is shown below. 

Synthesis of CPG Puromycin. The synthesis of protected CPG puromycin 
followed the general path used for deoxynucleosides as previously outlined (Gait, 
Oligonucleotide Synthesis, A Practical Approach, The Practical Approach Series (IRL 
Press, Oxford, 1984)). Major departures included the selection of an appropriate N 
blocking group, attachment at the puromycin 2' OH to the solid support, and the 
linkage reaction to the solid support. In the case of the latter, the reaction was carried 
out at very low concentrations of activated nucleotide as this material was 
significantly more precious than the solid support. The resulting yield (~20 µmol/g 
support) was quite satisfactory considering the dilute reaction conditions. 

Synthesis of N-Trifluoroacetyl Puromycin. 267 mg (0.490 mmol) 
Puromycin•HCl was first converted to the free base form by dissolving in water, 
adding pH 11 carbonate buffer, and extracting (3X) into chloroform. The organic 
phase was evaporated to dryness and weighed (242 mg, 0.513 mmol). The free base 
was then dissolved in 11 ml dry pyridine and 11 ml dry acetonitrile, and 139 µl (2.0 
mmol) triethylamine (TEA; Fluka) and 139 µl (1.0 mmol) of trifluoroacetic anhydride 
(TFAA; Fluka) were added with stirring. TFAA was then added to the turbid solution 
in 20 µl aliquots until none of the starting material remained, as assayed by thin layer 
chromatography (tlc) (93:7, Chloroform/MeOH) (a total of 280 µl). The reaction was 
allowed to proceed for one hour. At this point, two bands were revealed by thin layer 
chromatography, both of higher mobility than the starting material. Workup of the 
reaction with NH4OH and water reduced the product to a single band. Silica 
chromatography (93:7 Chloroform/MeOH) yielded 293 mg (0.515 mmol) of the 
product, N-TFA-Pur. The product of this reaction is shown schematically in Figure 4. 

Synthesis of N-Trifluoroacetyl 5'-DMT Puromycin. The product from the 
above reaction was aliquoted and coevaporated 2X with dry pyridine to remove water. 
Multiple tubes were prepared to test multiple reaction conditions. In a small scale 
reaction, 27.4 mg (48.2 µmol) N-TFA-Pur was dissolved in 480 µl of pyridine 
containing 0.05 eq of DMAP and 1.4 eq TEA. To this mixture, 20.6 mg of 
di-methoxy trityl chloride (60 µmol) was added, and the reaction was allowed to 
proceed to completion with stirring. The reaction was stopped by addition of an equal 
volume of water (approximately 500 µl) to the solution. Because this reaction 
appeared successful, a large scale version was performed. In particular, 262 mg 
(0.467 mmol) N-TFA-Pur was dissolved in 2.4 ml pyridine followed by addition of 
1.4 eq of TEA, 0.05 eq of DMAP, and 1.2 eq of di-methoxy trityl chloride (Sigma). 
After approximately two hours, an additional 50 mg (0.3 eq) dimethoxytrityl•Cl 
(DMT•Cl) was added, and the reaction was allowed to proceed for 20 additional 
minutes. The reaction was stopped by the addition of 3 ml of water and coevaporated 
3X with CH3CN. The reaction was purified by 95:5 Chloroform/MeOH on a 100 ml 
silica (dry) 2 mm diameter column. Due to incomplete purification, a second identical 
column was run with 97.5:2.5 Chloroform/MeOH. The total yield was 325 mg or 
0.373 mmol (or a yield of 72%). The product of this reaction is shown schematically 
in Figure 4. 

Synthesis of N-Trifluoroacetyl, 5'-DMT, 2' Succinyl Puromycin. In a 
small scale reaction, 32 mg (37 µmol) of the product synthesized above was combined 
with 1.2 eq of DMAP dissolved in 350 µl of pyridine. To this solution, 1.2 
equivalents of succinic anhydride was added in 44 µl of dry CH3CN and allowed to 
stir overnight. Thin layer chromatography revealed little of the starting material 
remaining. In a large scale reaction, 292 mg (336 µmol) of the previous product was 
combined with 1.2 eq DMAP in 3 ml of pyridine. To this, 403 µl of 1M succinic 
anhydride (Fluka) in dry CH3CN was added, and the mixture was allowed to stir 
overnight. Thin layer chromatography again revealed little of the starting material 
remaining. The two reactions were combined, and an additional 0.2 eq of DMAP and 
succinate were added. The product was coevaporated with toluene 1X and dried to a 
yellow foam in high vacuum. CH2Cl2 was added (20 ml), and this solution was 
extracted twice with 15 ml of 10% ice cold citric acid and then twice with pure water. 
The product was dried, redissolved in 2 ml of CH2Cl2, and precipitated by addition of 
50 ml of hexane with stirring. The product was then vortexed and centrifuged at 600 
rpm for 10 minutes in the clinical centrifuge. The majority of the eluent was drawn 
off, and the rest of the product was dried, first at low vacuum, then at high vacuum in 
a desiccator. The yield of this reaction was approximately 260 µmol for a stepwise 
yield of ~70%. 
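
The stepwise yield quoted for the succinylation can be reproduced from the amounts given in this paragraph, on the assumption that the yield is reckoned against the combined small scale and large scale inputs (37 and 336 µmol), since the two reactions were pooled before workup. The helper name `stepwise_yield` is hypothetical, and the sketch is only an arithmetic check.

```python
def stepwise_yield(product_umol, inputs_umol):
    """Fractional yield of a step whose starting material came from
    several pooled reactions (all amounts in micromoles)."""
    return product_umol / sum(inputs_umol)


if __name__ == "__main__":
    # 260 umol recovered from pooled 37 umol + 336 umol reactions.
    y = stepwise_yield(260, [37, 336])
    print(f"stepwise yield: {y:.1%}")
```

The result, a little under 70%, agrees with the approximate figure stated in the text.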

Synthesis of N-Trifluoroacetyl 5'-DMT, 2' Succinyl, CPG Puromycin. The 
product from the previous step was next dissolved with 1 ml of dioxane (Fluka) 
followed by 0.2 ml dioxane/0.2 ml pyridine. To this solution, 40 mg of p-nitrophenol 
(Fluka) and 140 mg of dicyclohexylcarbodiimide (DCC; Sigma) were added, and the 
reaction was allowed to proceed for 2 hours. The insoluble cyclohexyl urea produced 
by the reaction was removed by centrifugation, and the product solution was added to 
5 g of aminohexyl controlled pore glass (CPG) suspended in 22 ml of dry DMF and 
stirred overnight. The resin was then washed with DMF, methanol, and ether, and 
dried. The resulting resin was assayed as containing 22.6 µmol of trityl per g, well 
within the acceptable range for this type of support. The support was then capped by 
incubation with 15 ml of pyridine, 1 ml of acetic anhydride, and 60 mg of DMAP for 
30 minutes. The resulting column material produced a negative (no color) ninhydrin 
test, in contrast to the results obtained before blocking in which the material produced 
a dark blue color reaction. The product of this reaction is shown schematically in 
Figure 4. Alternatively, puromycin-CPG may be obtained commercially (Trilink). 

Synthesis of mRNA-Puromycin Conjugate. As discussed above, a 
puromycin tethered oligo may be used in either of two ways to generate an 
mRNA-puromycin conjugate which acts as a translation template. For extremely 
short open reading frames, the puromycin oligo is typically extended chemically with 
RNA or DNA monomers to create a totally synthetic template. When longer open 
reading frames are desired, the RNA or DNA oligo is generally ligated to the 3' end of 
an mRNA using a DNA splint and T4 DNA ligase as described by Moore and Sharp 
(Science 256:992 (1992)). 



IN VITRO TRANSLATION AND 
TESTING OF RNA-PROTEIN FUSIONS 
The templates generated above were translated in vitro using both bacterial 
and eukaryotic in vitro translation systems as follows. 

In Vitro Translation of Minimal Templates. 43-P and related 
RNA-puromycin conjugates were added to several different in vitro translation 
systems including: (i) the S30 system derived from E. coli MRE600 (Zubay, Ann. 
Rev. Genet. 7:267 (1973); Collins, Gene 6:29 (1979); Chen and Zubay, Methods 
Enzymol. 101:44 (1983); Pratt, in Transcription and Translation: A Practical 
Approach, B. D. Hammes, S. J. Higgins, Eds. (IRL Press, Oxford, 1984) pp. 
179-209; and Ellman et al., Methods Enzymol. 202:301 (1991)) prepared as described 
by Ellman et al. (Methods Enzymol. 202:301 (1991)); (ii) the ribosomal fraction 
derived from the same strain, prepared as described by Kudlicki et al. (Anal. Chem. 
206:389 (1992)); and (iii) the S30 system derived from E. coli BL21, prepared as 
described by Lesley et al. (J. Biol. Chem. 266:2632 (1991)). In each case, the premix 
used was that of Lesley et al. (J. Biol. Chem. 266:2632 (1991)), and the incubations 
were 30 minutes in duration. 

Testing the Nature of the Fusion. The 43-P template was first tested using 
S30 translation extracts from E. coli. Figure 5 (Reaction "A") demonstrates the 
desired intramolecular (cis) reaction wherein 43-P binds the ribosome and acts as a 
template for and an acceptor of fMet at the same time. The incorporation of 
35S-methionine and its position in the template were first tested, and the results are 
shown in Figures 6A and 6B. After extraction of the in vitro translation reaction 
mixture with phenol/chloroform and analysis of the products by SDS-PAGE, a 35S 
labeled band appeared with the same mobility as the 43-P template. The amount of 
this material synthesized was dependent upon the Mg2+ concentration (Figure 6A). 
The optimum Mg2+ concentration appeared to be between 9 and 18 mM, which was 
similar to the optimum for translation in this system (Zubay, Ann. Rev. Genet. 7:267 
(1973); Collins, Gene 6:29 (1979); Chen and Zubay, Methods Enzymol. 101:44 
(1983); Pratt, in Transcription and Translation: A Practical Approach, B. D. 
Hammes, S. J. Higgins, Eds. (IRL Press, Oxford, 1984) pp. 179-209; Ellman et al., 
Methods Enzymol. 202:301 (1991); Kudlicki et al., Anal. Chem. 206:389 (1992); and 
Lesley et al., J. Biol. Chem. 266:2632 (1991)). Furthermore, the incorporated label 
was stable to treatment with NH4OH (Figure 6B), indicating that the label was located 
on the 3' half of the molecule (the base-stable DNA portion) and was attached by a 
base-stable linkage, as expected for an amide bond between puromycin and fMet. 

Ribosome and Template Dependence. To demonstrate that the reaction 
observed above occurred on the ribosome, the effects of specific inhibitors of the 
peptidyl transferase function of the ribosome were tested (Figure 6C), and the effect 
of changing the sequence coding for methionine was examined (Figure 6D). Figure 
6C demonstrates clearly that the reaction was strongly inhibited by the peptidyl 
transferase inhibitors, virginiamycin, gougerotin, and chloramphenicol (Monro and 
Vazquez, J. Mol. Biol. 28:161-165 (1967); and Vazquez and Monro, Biochimica et 
Biophysica Acta 142:155-173 (1967)). Figure 6D demonstrates that changing a 
single base in the template from A to C abolished incorporation of 35S methionine at 
9 mM Mg2+, and greatly decreased it at 18 mM (consistent with the fact that high 
levels of Mg2+ allow misreading of the message). These experiments demonstrated 
that the reaction occurred on the ribosome in a template dependent fashion. 

Linker Length. Also tested was the dependence of the reaction on the 
length of the linker (Figure 6E). The original template was designed so that the linker 
spanned the distance from the decoding site (occupied by the AUG of the template) to 
the acceptor site (occupied by the puromycin moiety), a distance which was 
approximately the same length as the distance between the anticodon loop and the 
acceptor stem in a tRNA, or about 60-70 Å. The first linker tested was 30 nucleotides 
in length, based upon a minimum of 3.4 Å per base (≥ 102 Å). In the range between 
30 and 21 nucleotides (n = 27 - 18; length ≥ 102 - 71 Å), little change was seen in the 
efficiency of the reaction. Accordingly, linker length may be varied. While a linker 
of between 21 and 30 nucleotides represents a preferred length, linkers shorter than 80 
nucleotides and, preferably, shorter than 45 nucleotides may also be utilized in the 
invention. 
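
The linker spans quoted above follow directly from the stated minimum of 3.4 Å per base. The short Python sketch below reproduces that arithmetic; the function names are hypothetical, and the 70 Å default is simply the upper end of the 60-70 Å distance given in the text.

```python
RISE_PER_BASE_A = 3.4  # minimum rise per base in angstroms, as stated in the text


def min_linker_span(n_nucleotides):
    """Lower bound on linker length in angstroms."""
    return n_nucleotides * RISE_PER_BASE_A


def spans_ribosome(n_nucleotides, distance_a=70.0):
    """True if the minimum span reaches the decoding-site-to-acceptor-site
    distance (default: upper end of the 60-70 angstrom range)."""
    return min_linker_span(n_nucleotides) >= distance_a


if __name__ == "__main__":
    for n in (30, 21):
        print(n, "nt ->", round(min_linker_span(n), 1), "angstroms;",
              "spans 70 angstroms:", spans_ribosome(n))
```

Both the 30 nucleotide and the 21 nucleotide linkers clear the 70 Å mark on this estimate, which is consistent with the observation that reaction efficiency changed little over this range.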

Intramolecular vs. Intermolecular Reactions. Finally, we tested whether 
the reaction occurred in an intramolecular fashion (Figure 5, Reaction "A") as desired 
or intermolecularly (Figure 5, Reaction "B"). This was tested by adding 
oligonucleotides with 3' puromycin but no ribosome binding sequence (i.e., templates 
25-P, 13-P, and 30-P) to the translation reactions containing the 43-P template 
(Figures 6F, 6G, and 6H). If the reaction occurred by an intermolecular mechanism, 
the shorter oligos would also be labeled. As demonstrated in Figures 6F-H, there was 
little incorporation of 35S methionine in the three shorter oligos, indicating that the 
reaction occurred primarily in an intramolecular fashion. The sequences of 25-P 
(SEQ ID NO: 10), 13-P (SEQ ID NO: 9), and 30-P (SEQ ID NO: 8) are shown below. 

Reticulocyte Lysate. Figure 6H demonstrates that 35S-methionine may be 
incorporated in the 43-P template using a rabbit reticulocyte lysate (see below) for in 
vitro translation, in addition to the E. coli lysates used above. This reaction occurred 
primarily by an intramolecular mechanism, as desired. 

SYNTHESIS AND TESTING OF FUSIONS 
CONTAINING A C-MYC EPITOPE TAG 

Exemplary fusions were also generated which contained, within the 
protein portion, the epitope tag for the c-myc monoclonal antibody 9E10 (Evan et al., 
Mol. Cell Biol. 5:3610 (1985)). 

Design of Templates. Three initial epitope tag templates (i.e., LP77, 
LP154, and Pool #1) were designed and are shown in Figures 7A-C. The first two 
templates contained the c-myc epitope tag sequence EQKLISEEDL (SEQ ID NO: 2), 
and the third template was the design used in the synthesis of a random selection pool. 
LP77 encoded a 12 amino acid sequence, with the codons optimized for bacterial 
translation. LP154 and its derivatives contained a 33 amino acid mRNA sequence in 
which the codons were optimized for eukaryotic translation. The encoded amino acid 
sequence of MAEEQKLISEEDLLRKRREQKLKHKLEQLRNSCA (SEQ ID NO: 7) 
corresponded to the original peptide used to isolate the 9E10 antibody. Pool #1 
contained 27 codons of NNG/C (to generate random peptides) followed by a sequence 



WO 00/47775 



PCT/US00/02589 




corresponding to the last seven amino acids of the myc peptide (which were not part 
of the myc epitope sequence). These sequences are shown below. 
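
As an aside on the NNG/C design used for Pool #1, restricting the third codon position to G or C reduces, but does not eliminate, in-frame stop codons. The Python sketch below enumerates the NNG/C codon set and estimates the fraction of 27-codon random regions that contain no stop codon. It assumes the standard genetic code and equal representation of every allowed base at each position, neither of which is stated in the text; the function names are hypothetical.

```python
from itertools import product

BASES = "ACGU"
STOPS = {"UAA", "UAG", "UGA"}  # standard genetic code (assumption)


def nns_codons():
    """All codons of the form N-N-(G or C)."""
    return ["".join(c) for c in product(BASES, BASES, "GC")]


def fraction_without_stop(n_codons):
    """Probability that n_codons independent NNG/C codons contain no stop,
    assuming every allowed codon is equally likely."""
    codons = nns_codons()
    sense = sum(1 for c in codons if c not in STOPS)
    return (sense / len(codons)) ** n_codons


if __name__ == "__main__":
    codons = nns_codons()
    print("codons:", len(codons), "| stops:", sorted(set(codons) & STOPS))
    print("fraction of 27-codon regions without a stop:",
          round(fraction_without_stop(27), 3))
```

Under these assumptions only UAG remains among the allowed codons, so somewhat fewer than half of the random regions would be expected to read through all 27 positions.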

Reticulocyte vs. Wheat Germ In Vitro Translation Systems. The 43-P, 
LP77, and LP154 templates were tested in both rabbit reticulocyte and wheat germ 
extract (Promega, Boehringer Mannheim) translation systems (Figure 8). 
Translations were performed at 30°C for 60 minutes. Templates were isolated using 
dT25 agarose at 4°C. Templates were eluted from the agarose using 15 mM NaOH, 1 
mM EDTA, neutralized with NaOAc/HOAc buffer, immediately ethanol precipitated 
(2.5-3 vol), washed (with 100% ethanol), and dried on a speedvac concentrator. 
Figure 8 shows that 35S methionine was incorporated into all three templates, in both 
the wheat germ and reticulocyte systems. Less degradation of the template was 
observed in the fusion reactions from the reticulocyte system and, accordingly, this 
system is preferred for the generation of RNA-protein fusions. In addition, in general, 
eukaryotic systems are preferred over bacterial systems. Because eukaryotic cells 
tend to contain lower levels of nucleases, mRNA lifetimes are generally 10-100 times 
longer in these cells than in bacterial cells. In experiments using one particular E. coli 
translation system, generation of fusions was not observed using a template encoding 
the c-myc epitope; labeling the template in various places demonstrated that this was 
likely due to degradation of both the RNA and DNA portions of the template. 

To examine the peptide portion of these fusions, samples were treated with 
RNase to remove the coding sequences. Following this treatment, the 43-P product 
ran with almost identical mobility to the 32P labeled 30-P oligo, consistent with a very 
small peptide (perhaps only methionine) added to 30-P. For LP77, removal of the 
coding sequence produced a product with lower mobility than the 30-P oligo, 
consistent with the notion that a 12 amino acid peptide was added to the puromycin. 
Finally, for LP154, removal of the coding sequence produced a product of yet lower 
mobility, consistent with a 33 amino acid sequence attached to the 30-P oligo. No 
oligo was seen in the RNase-treated LP154 reticulocyte lane due to a loading error. In 
Figure 9, the mobility of this product was shown to be the same as the product 
generated in the wheat germ extract. In sum, these results indicated that RNase 
resistant products were added to the ends of the 30-P oligos, that the sizes of the 
products were proportional to the length of the coding sequences, and that the 
products were quite homogeneous in size. In addition, although both systems 
produced similar fusion products, the reticulocyte system appeared superior due to 
higher template stability. 

Sensitivity to RNase A and Proteinase K. In Figure 9, sensitivity to RNase 
A and proteinase K was tested using the LP154 fusion. As shown in lanes 2-4, 
incorporation of 35S methionine was demonstrated for the LP154 template. When this 
product was treated with RNase A, the mobility of the fusion decreased, but was still 
significantly higher than the 32P labeled 30-P oligonucleotide, consistent with the 
addition of a 33 amino acid peptide to the 3' end. When this material was also treated 
with proteinase K, the 35S signal completely disappeared, again consistent with the 
notion that the label was present in a peptide at the 3' end of the 30-P fragment. 
Similar results have been obtained in equivalent experiments using the 43-P and LP77 
fusions. 

To confirm that the template labeling by 35S Met was a consequence of 
translation, and more specifically resulted from the peptidyl transferase activity of the 
ribosome, the effect of various inhibitors on the labeling reaction was examined. The 
specific inhibitors of eukaryotic peptidyl transferase, anisomycin, gougerotin, and 
sparsomycin (Vazquez, Inhibitors of Protein Biosynthesis (Springer-Verlag, New 
York), pp. 312 (1979)), as well as the translocation inhibitors cycloheximide and 
emetine (Vazquez, Inhibitors of Protein Biosynthesis (Springer-Verlag, New York), 
pp. 312 (1979)) all decreased RNA-peptide fusion formation by ~95% using the long 
myc template and a reticulocyte lysate translation extract. 

Immunoprecipitation Experiments. In an experiment designed to illustrate 
the efficacy of immunoprecipitating an mRNA-peptide fusion, an attempt was made 
to immunoprecipitate a free c-myc peptide generated by in vitro translation. Figure 
10 shows the results of these experiments assayed on an SDS PAGE peptide gel. 
Lanes 1 and 2 show the labeled material from translation reactions containing either 
RNA124 (the RNA portion of LP154) or β-globin mRNA. Lanes 3-8 show the 
immunoprecipitation of these reaction samples using the c-myc monoclonal antibody 
9E10, under several different buffer conditions (described below). Lanes 3-5 show 
that the peptide derived from RNA124 was effectively immunoprecipitated, with the 
best case being lane 4 where ~83% of the total TCA precipitable counts were isolated. 
Lanes 6-8 show little of the β-globin protein, indicating a purification of >100 fold. 
These results indicated that the peptide coded for by RNA124 (and by LP154) can be 
quantitatively isolated by this immunoprecipitation protocol. 

Immunoprecipitation of the Fusion. We next tested the ability to 
immunoprecipitate a chimeric RNA-peptide product, using an LP154 translation 
reaction and the c-myc monoclonal antibody 9E10 (Figure 11). The translation 
products from a reticulocyte reaction were isolated by immunoprecipitation (as 
described herein) and treated with 1 µg of RNase A at room temperature for 30 
minutes to remove the coding sequence. This generated a 5'OH, which was 32P 
labeled with T4 polynucleotide kinase and assayed by denaturing PAGE. Figure 11 
demonstrates that a product with a mobility similar to that seen for the fusion of the 
c-myc epitope with 30-P generated by RNase treatment of the LP154 fusion (see above) 
was isolated, but no corresponding product was made when only the RNA portion of 
the template (RNA124) was translated. In Figure 12, the quantity of fusion protein 
isolated was determined and was plotted against the amount of unmodified 30-P (not 
shown in this figure). Quantitation of the ratio of unmodified linker to linker-myc 
peptide fusion shows that 0.2 - 0.7% of the input message was converted to fusion 
product. A higher fraction of the input RNA was converted to fusion product in the 
presence of a higher ribosome/template ratio; over the range of input mRNA 
concentrations that were tested, approximately 0.8 - 1.0 x 10^12 fusion molecules were 
made per ml of translation extract. 

In addition, our results indicated that the peptides attached to the RNA 
species were encoded by that mRNA, i.e. the nascent peptide was not transferred to 
the puromycin of some other mRNA. No indication of cross-transfer was seen when a 
linker (30-P) was coincubated with the long myc template in translation extracts in 
ratios as high as 20:1, nor did the presence of free linker significantly decrease the 
amount of long myc fusion produced. Similarly, co-translation of the short and long 
templates, 43-P and LP154, produced only the fusion products seen when the 
templates were translated alone, and no products of intermediate mobility were 
observed, as would be expected for fusion of the short template with the long myc 
peptide. Both of these results suggested that fusion formation occurred primarily 
between a nascent peptide and mRNA bound to the same ribosome. 

Sequential Isolation. As a further confirmation of the nature of the in vitro 
translated LP154 template product, we examined the behavior of this product on two 
different types of chromatography media. Thiopropyl (TP) sepharose allows the 
isolation of a product containing a free cysteine (for example, the LP154 product 
which has a cysteine residue adjacent to the C terminus) (Figure 13). Similarly, dT25 
agarose allows the isolation of templates containing a poly dA sequence (for example, 
30-P) (Figure 13). Figure 14 demonstrates that sequential isolation on TP sepharose 
followed by dT25 agarose produced the same product as isolation on dT25 agarose 
alone. The fact that the in vitro translation product contained both a poly-A tract and 
a free thiol strongly indicated that the translation product was the desired 
RNA-peptide fusion. 
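
The two purification handles on the peptide side, the 9E10 epitope and the cysteine near the C terminus, can be located in the LP154 peptide sequence given earlier (SEQ ID NO: 7). The Python sketch below is only a sequence check; the function name `handles` and the returned field names are hypothetical.

```python
LP154_PEPTIDE = "MAEEQKLISEEDLLRKRREQKLKHKLEQLRNSCA"  # SEQ ID NO: 7
MYC_EPITOPE = "EQKLISEEDL"                             # SEQ ID NO: 2


def handles(peptide, epitope=MYC_EPITOPE):
    """Locate the antibody epitope and any cysteines in a peptide.

    epitope_start is 0-based (-1 if absent); cysteine positions are counted
    from the C terminus, with 1 meaning the last residue."""
    return {
        "epitope_start": peptide.find(epitope),
        "cys_from_c_term": [len(peptide) - i
                            for i, aa in enumerate(peptide) if aa == "C"],
    }


if __name__ == "__main__":
    print(handles(LP154_PEPTIDE))
```

For the LP154 peptide the single cysteine is the second residue from the C terminus, matching the description of a cysteine adjacent to the C terminus that makes thiopropyl sepharose capture possible.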

The above results are consistent with the ability to synthesize 
mRNA-peptide fusions and to recover them intact from in vitro translation extracts. The 
peptide portions of fusions so synthesized appeared to have the intended sequences as 
demonstrated by immunoprecipitation and isolation using appropriate 
chromatographic techniques. According to the results presented above, the reactions 
are intramolecular and occur in a template dependent fashion. Finally, even with a 
template modification of less than 1%, the present system facilitates selections based 
on candidate complexities of about 10^13 molecules. 

C-Myc Epitope Recovery Selection. To select additional c-myc epitopes, a 
large library of translation templates (for example, 10^15 members) is generated 
containing a randomized region (see Figure 7C and below). This library is used to 
generate ~10^12 - 10^13 fusions (as described herein) which are treated with the 
anti-c-myc antibody (for example, by immunoprecipitation or using an antibody 
immobilized on a column or other solid support) to enrich for c-myc-encoding 
templates in repeated rounds of in vitro selection. 
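
To give a sense of scale for the selection just described, the fusion yield reported earlier for the LP154 template (approximately 0.8 - 1.0 x 10^12 fusion molecules per ml of translation extract) can be used to estimate how much extract a target number of fusions would require. The Python sketch below is a rough estimate only; it assumes that yield carries over to a library template, which the text does not state, and the function names are hypothetical.

```python
def extract_volume_ml(target_fusions, fusions_per_ml):
    """Volume of translation extract (ml) needed for a target number
    of fusion molecules at a given yield per ml."""
    return target_fusions / fusions_per_ml


def library_fraction_sampled(n_fusions, library_size):
    """Fraction of library members represented if every fusion is distinct."""
    return n_fusions / library_size


if __name__ == "__main__":
    low = extract_volume_ml(1e12, 1.0e12)    # fewest fusions, best yield
    high = extract_volume_ml(1e13, 0.8e12)   # most fusions, lowest yield
    print(f"extract needed: {low:.1f} to {high:.1f} ml")
    print("fraction of a 1e15 library sampled by 1e13 fusions:",
          library_fraction_sampled(1e13, 1e15))
```

On these figures, 10^13 fusions drawn from a 10^15 member library sample about 1% of the templates, in line with the earlier remark that template modification of less than 1% still supports complexities of about 10^13 molecules.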

Models for Fusion Formation. Without being bound to a particular theory, 
we propose a model for the mechanism of fusion formation in which translation 
initiates normally and elongation proceeds to the end of the open reading frame. 
When the ribosome reaches the DNA portion of the template, translation stalls. At this 
point, the complex can partition between two fates: dissociation of the nascent 
peptide, or transfer of the nascent peptide to the puromycin at the 3'-end of the 
template. The efficiency of the transfer reaction is likely to be controlled by a number 
of factors that influence the stability of the stalled translation complex and the entry of 
the 3'-puromycin residue into the A site of the peptidyl transferase center. After the 
transfer reaction, the mRNA-peptide fusion likely remains complexed with the 
ribosome since the known release factors cannot hydrolyze the stable amide linkage 
between the RNA and peptide domains. 

Both the classical model for elongation (Watson, Bull. Soc. Chim. Biol. 
46:1399 (1964)) and the intermediate states model (Moazed and Noller, Nature 
342:142 (1989)) require that the A site be empty for puromycin entry into the peptidyl 
transferase center. For the puromycin to enter the empty A site, the linker must either 
loop around the outside of the ribosome or pass directly from the decoding site 
through the A site to the peptidyl transferase center. The data described herein do not 
clearly distinguish between these alternatives because the shortest linker tested (21 
nts) is still long enough to pass around the outside of the ribosome. In some models 
of ribosome structure (Frank et al., Nature 376:441 (1995)), the mRNA is threaded 
through a channel that extends on either side of the decoding site, in which case 
unthreading of the linker from the channel would be required to allow the puromycin 
to reach the peptidyl transferase center through the A site. 

Transfer of the nascent peptide to the puromycin appeared to be slow
relative to the elongation process, as demonstrated by the homogeneity and length of
the peptide attached to the linker. If the puromycin competed effectively with
aminoacyl tRNAs during elongation, the linker-peptide fusions present in the fusion
products would be expected to be heterogeneous in size. Furthermore, the ribosome
did not appear to read into the linker region, as indicated by the similarity in gel
mobilities between the Met-template fusion and the unmodified linker. If translated,
dA3n should code for (lysine)n, which would certainly decrease the mobility of the
linker. The slow rate of unthreading of the mRNA may explain the slow rate of fusion
formation relative to the rate of translocation. Preliminary results suggest that the
amount of fusion product formed increases markedly following extended
post-translation incubation at low temperature, perhaps because of the increased time
available for transfer of the nascent peptide to the puromycin.

DETAILED MATERIALS AND METHODS

Described below are detailed materials and methods relating to the in vitro 
translation and testing of RNA-protein fusions, including fusions having a myc 
epitope tag. 

Sequences. A number of oligonucleotides were used above for the
generation of RNA-protein fusions. These oligonucleotides have the following
sequences.

NAME SEQUENCE

30-P 5'AAA AAA AAA AAA AAA AAA AAA AAA AAA CCP (SEQ ID
NO: 8)

13-P 5'AAA AAA AAA ACC P (SEQ ID NO: 9)

25-P 5'CGC GGT TTT TAT TTT TTT TTT TCC P (SEQ ID NO: 10)

43-P 5'rGrGrA rGrGrA rCrGrA rArArU rGAA AAA AAA AAA AAA AAA
AAA AAA AAA ACC P (SEQ ID NO: 11)

43-P [CUG] 5'rGrGrA rGrGrA rCrGrA rArCrU rGAA AAA AAA AAA AAA
AAA AAA AAA AAA ACC P (SEQ ID NO: 12)

40-P 5'rGrGrA rGrGrA rCrGrA rArCrU rGAA AAA AAA AAA AAA AAA
AAA AAA ACC P (SEQ ID NO: 13)

37-P 5'rGrGrA rGrGrA rCrGrA rArCrU rGAA AAA AAA AAA AAA AAA
AAA ACC P (SEQ ID NO: 14)

34-P 5'rGrGrA rGrGrA rCrGrA rArCrU rGAA AAA AAA AAA AAA AAA
ACC P (SEQ ID NO: 15)

31-P 5'rGrGrA rGrGrA rCrGrA rArCrU rGAA AAA AAA AAA AAA ACC P
(SEQ ID NO: 16)

LP77 5'rGrGrG rArGrG rArCrG rArArA rUrGrG rArArC rArGrA rArArC 
rUrGrA rUrCrU rCrUrG rArArG rArArG rArCrC rUrGrA rArC AAA AAA AAA 
AAA AAA AAA AAA AAA AAA CCP (SEQ ID NO: 1) 

LP154 5'rGrGrG rArCrA rArUrU rArCrU rArUrU rUrArC rArArU rUrArC rA 
rArUrG rGrCrU rGrArA rGrArA rCrArG rArArA rCrUrG rArUrC rUrCrU rGrArA 
rGrArA rGrArC rCrUrG rCrUrG rCrGrU rArArA rCrGrU rCrGrU rGrArA rCrArG 
rCrUrG rArArA rCrArC rArArA rCrUrG rGrArA rCrArG rCrUrG rCrGrU rArArC 
rUrCrU rUrGrC rGrCrU AAA AAA AAA AAA AAA AAA AAA AAA AAA CCP 
(SEQ ID NO: 3)

LP160 5'rGrGrG rArCrA rArUrU rArCrU rArUrU rUrArC rArArU rUrArC rA
rArUrG rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS 
rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS 
rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rCrArG rCrUrG 
rCrGrU rArArC rUrCrU rUrGrC rGrCrU AAA AAA AAA AAA AAA AAA AAA
AAA AAA CCP (SEQ ID NO: 17)

All oligonucleotides are listed in the 5' to 3' direction. Ribonucleotide bases are
indicated by lower case "r" prior to the nucleotide designation; P is puromycin; rN
indicates equal amounts of rA, rG, rC, and rU; rS indicates equal amounts of rG and
rC; and all other base designations indicate DNA oligonucleotides.
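
The notation above can be read mechanically. The following is a minimal sketch, not part of the original methods, that tokenizes a listed sequence into RNA residues, DNA residues, and the 3' puromycin, and confirms that the residue count matches the number in the oligonucleotide name. The function names and the regular expression are illustrative assumptions.

```python
import re

# One token = optional "r" (ribonucleotide marker) plus one residue symbol.
# P = puromycin; N and S are the mixed positions defined in the text.
TOKEN = re.compile(r"r?[ACGUTNSP]")

def parse_oligo(notation):
    """Split the table notation into (kind, residue) pairs, 5' to 3'."""
    compact = notation.replace(" ", "")
    tokens = TOKEN.findall(compact)
    if "".join(tokens) != compact:
        raise ValueError("unrecognized characters in: " + notation)
    residues = []
    for tok in tokens:
        if tok == "P":
            residues.append(("puromycin", "P"))
        elif tok.startswith("r"):
            residues.append(("RNA", tok[1]))
        else:
            residues.append(("DNA", tok))
    return residues

# Sequences transcribed from the table above (repeats written compactly).
OLIGOS = {
    "30-P": "AAA " * 9 + "CCP",
    "13-P": "AAA AAA AAA ACC P",
    "43-P": "rGrGrA rGrGrA rCrGrA rArArU rGAA " + "AAA " * 8 + "ACC P",
}

def summarize(name):
    """Return (total length, counts per residue kind) for a named oligo."""
    counts = {"RNA": 0, "DNA": 0, "puromycin": 0}
    residues = parse_oligo(OLIGOS[name])
    for kind, _ in residues:
        counts[kind] += 1
    return len(residues), counts

for name in OLIGOS:
    print(name, summarize(name))
```

Under this reading, each name encodes the total residue count including the puromycin (for example, 43-P has 13 ribonucleotides, 29 deoxynucleotides, and one puromycin).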

Chemicals. Puromycin HCl, long chain alkylamine controlled pore glass,
gougerotin, chloramphenicol, virginiamycin, DMAP, dimethyltrityl chloride, and
acetic anhydride were obtained from Sigma Chemical (St. Louis, MO). Pyridine,
dimethylformamide, toluene, succinic anhydride, and para-nitrophenol were obtained
from Fluka Chemical (Ronkonkoma, NY). Beta-globin mRNA was obtained from
Novagen (Madison, WI). TMV RNA was obtained from Boehringer Mannheim
(Indianapolis, IN).

Enzymes. Proteinase K was obtained from Promega (Madison, WI).
DNase-free RNase was either produced by the protocol of Sambrook et al. (supra) or
purchased from Boehringer Mannheim. T7 polymerase was made by the published
protocol of Grodberg and Dunn (J. Bacteriol. 170:1245 (1988)) with the modifications
of Zawadzki and Gross (Nucl. Acids Res. 19:1948 (1991)). T4 DNA ligase was
obtained from New England Biolabs (Beverly, MA).

Quantitation of Radiolabel Incorporation. For radioactive gel bands, the
amount of radiolabel (35S or 32P) present in each band was determined by quantitation
either on a Betagen 603 blot analyzer (Betagen, Waltham, MA) or using
phosphorimager plates (Molecular Dynamics, Sunnyvale, CA). For liquid and solid
samples, the amount of radiolabel (35S or 32P) present was determined by scintillation
counting (Beckman, Columbia, MD).

Gel Images. Images of gels were obtained by autoradiography (using 
Kodak XAR film) or using phosphorimager plates (Molecular Dynamics). 

Synthesis of CPG Puromycin. Detailed protocols for synthesis of
CPG-puromycin are outlined above.

Enzymatic Reactions. In general, the preparation of nucleic acids for
kinase, transcription, PCR, and translation reactions using E. coli extracts was the
same. Each preparative protocol began with extraction using an equal volume of 1:1
phenol/chloroform, followed by centrifugation and isolation of the aqueous phase.
Sodium acetate (pH 5.2) and spermidine were added to a final concentration of 300
mM and 1 mM respectively, and the sample was precipitated by addition of 3
volumes of 100% ethanol and incubation at -70°C for 20 minutes. Samples were
centrifuged at > 12,000 g, the supernatant was removed, and the pellets were washed
with an excess of 95% ethanol at 0°C. The resulting pellets were then dried under
vacuum and resuspended.

Oligonucleotides. All synthetic DNA and RNA was synthesized on a
Millipore Expedite synthesizer using standard chemistry for each as supplied from the
manufacturer (Milligen, Bedford, MA). Oligonucleotides containing 3' puromycin
were synthesized using CPG puromycin columns packed with 30-50 mg of solid
support (~20 µmole puromycin/gram). Oligonucleotides containing a 3' biotin were
synthesized using 1 µmole bioteg CPG columns from Glen Research (Sterling, VA).
Oligonucleotides containing a 5' biotin were synthesized by addition of bioteg
phosphoramidite (Glen Research) as the 5' base. Oligonucleotides to be ligated to the
3' ends of RNA molecules were either chemically phosphorylated at the 5' end (using
chemical phosphorylation reagent from Glen Research) prior to deprotection or
enzymatically phosphorylated using ATP and T4 polynucleotide kinase (New
England Biolabs) after deprotection. Samples containing only DNA (and 3'
puromycin or 3' biotin) were deprotected by addition of 25% NH4OH followed by
incubation for 12 hours at 55°C. Samples containing RNA monomers (e.g., 43-P)
were deprotected by addition of ethanol (25% (v/v)) to the NH4OH solution and
incubation for 12 hours at 55°C. The 2'OH was deprotected using 1 M TBAF in THF
(Sigma) for 48 hours at room temperature. TBAF was removed using a NAP-25
Sephadex column (Pharmacia, Piscataway, NJ).

If desired, to test for the presence of 3' hydroxyl groups, the puromycin
oligonucleotide may be radiolabeled at the 5' end using T4 polynucleotide kinase and
then used as a primer for extension with terminal deoxynucleotidyl transferase. The
presence of the primary amine in the puromycin may be assayed by reaction with
amine derivatizing reagents such as NHS-LC-biotin (Pierce). Oligonucleotides, such
as 30-P, show a detectable mobility shift by denaturing PAGE upon reaction,
indicating quantitative reaction with the reagent. Oligonucleotides lacking puromycin
do not react with NHS-LC-biotin and show no change in mobility.

Deprotected DNA and RNA samples were then purified using denaturing
PAGE, followed by either soaking or electro-eluting from the gel using an Elutrap
(Schleicher and Schuell, Keene, NH) and desalting using either a NAP-25 Sephadex
column or ethanol precipitation as described above.

Myc DNA construction. Two DNA templates containing the c-myc
epitope tag were constructed. The first template was made from a combination of the
oligonucleotides 64.27 (5'-GTT CAG GTC TTC TTC AGA GAT CAG TTT CTG
TTC CAT TTC GTC CTC CCT ATA GTG AGT CGT ATT A-3') (SEQ ID NO: 18)
and 18.109 (5'-TAA TAC GAC TCA CTA TAG-3') (SEQ ID NO: 19). Transcription
using this template produced RNA 47.1 which coded for the peptide
MEQKLISEEDLN (SEQ ID NO: 20). Ligation of RNA 47.1 to 30-P yielded LP77
shown in Figure 7A.

The second template was made first as a single oligonucleotide 99 bases in
length, having the designation RWR 99.6 and the sequence 5'-AGC GCA AGA GTT
ACG CAG CTG TTC CAG TTT GTG TTT CAG CTG TTC ACG ACG TTT ACG
CAG CAG GTC TTC TTC AGA GAT CAG TTT CTG TTC TTC AGC CAT-3'
(SEQ ID NO: 21). Double stranded transcription templates containing this sequence
were constructed by PCR with the oligos RWR 21.103 (5'-AGC GCA AGA GTT
ACG CAG CTG-3') (SEQ ID NO: 22) and RWR 63.26 (5'-TAA TAC GAC TCA
CTA TAG GGA CAA TTA CTA TTT ACA ATT ACA ATG GCT GAA GAA CAG
AAA CTG-3') (SEQ ID NO: 23) according to published protocols (Ausubel et al.,
supra, chapter 15). Transcription using this template produced an RNA referred to as
RNA124 which coded for the peptide
MAEEQKLISEEDLLRKRREQLKHKLEQLRNSCA (SEQ ID NO: 24). This
peptide contained the sequence used to raise monoclonal antibody 9E10 when
conjugated to a carrier protein (Oncogene Science Technical Bulletin). RNA124 was
124 nucleotides in length, and ligation of RNA124 to 30-P produced LP154 shown in
Figure 7B. The sequence of RNA124 is as follows (SEQ ID NO: 32):
5'-rGrGrG rArCrA rArUrU rArCrU rArUrU rUrArC rArArU rUrArC rArArUrG
rGrCrU rGrArA rGrArA rCrArG rArArA rCrUrG rArUrC rUrCrU rGrArA rGrArA
rGrArC rCrUrG rCrUrG rCrGrU rArArA rCrGrU rCrGrU rGrArA rCrArG rCrUrG
rArArA rCrArC rArArA rCrUrG rGrArA rCrArG rCrUrG rCrGrU rArArC rUrCrU
rUrGrC rGrCrU-3'
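
As a consistency check on the sequence above, the following minimal sketch translates RNA124 from its first AUG using the standard genetic code and compares the result with the stated peptide. The helper names are illustrative assumptions; the sequence is transcribed from the listing with the "r" prefixes removed.

```python
# Standard genetic code, built from the conventional compact string
# (first, second, third base each iterating in the order U, C, A, G).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# RNA124: 25-nt 5' leader followed by the 99-nt open reading frame.
LEADER = "GGGACAAUUACUAUUUACAAUUACA"
ORF = (
    "AUG GCU GAA GAA CAG AAA CUG AUC UCU GAA GAA "
    "GAC CUG CUG CGU AAA CGU CGU GAA CAG CUG "
    "AAA CAC AAA CUG GAA CAG CUG CGU AAC UCU "
    "UGC GCU"
).replace(" ", "")
RNA124 = LEADER + ORF

def translate(rna):
    """Translate from the first AUG until a stop codon or the end of the RNA."""
    start = rna.find("AUG")
    if start < 0:
        return ""
    peptide = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

print(len(RNA124), translate(RNA124))
```

Note that the open reading frame carries no stop codon, so translation runs to the 3' end of the RNA portion, consistent with the stalling model described earlier in this document.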

Randomized Pool Construction. The randomized pool was constructed as
a single oligonucleotide 130 bases in length denoted RWR130.1. Beginning at the 3'
end, the sequence was 3' CCCTGTTAATGATAAATGTTAATGTTAC (NNS)27
GTC GAC GCA TTG AGA TAC CGA-5' (SEQ ID NO: 25). N denotes a random
position, and this sequence was generated according to the standard synthesizer
protocol. S denotes an equal mix of dG and dC bases. PCR was performed with the
oligonucleotides 42.108 (5'-TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA
TTT ACA ATT ACA) (SEQ ID NO: 26) and 21.103 (5'-AGC GCA AGA GTT ACG
CAG CTG) (SEQ ID NO: 27). Transcription from this template produced an RNA
denoted pool 130.1. Ligation of pool 130.1 to 30-P yielded Pool #1 (also referred to as
LP160) shown in Figure 7C.
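
To illustrate what an NNS randomization scheme implies, the following sketch enumerates the NNS codons under the standard genetic code and estimates the fraction of 27-codon cassettes that contain no stop codon. It is an illustration only: it assumes every N and S position is an unbiased, independent mix, which real synthesis only approximates, and the variable names are hypothetical.

```python
from itertools import product

# Standard genetic code in DNA letters (T, C, A, G ordering).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# N = any of the four bases; S = G or C.
nns_codons = ["".join(c) for c in product("ACGT", "ACGT", "GC")]
stop_codons = [c for c in nns_codons if CODON_TABLE[c] == "*"]
amino_acids = {CODON_TABLE[c] for c in nns_codons} - {"*"}

# Probability that a 27-codon random cassette is free of stop codons,
# assuming independent, equiprobable codons.
cassette_length = 27
p_open = (1 - len(stop_codons) / len(nns_codons)) ** cassette_length

print(len(nns_codons), "NNS codons")
print("stop codons:", stop_codons)
print(len(amino_acids), "amino acids encoded")
print(f"fraction of cassettes without a stop: {p_open:.3f}")
```

Restricting the third position to G or C keeps all twenty amino acids available while leaving only one of the three stop codons in the pool.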

Seven cycles of PCR were performed according to published protocols
(Ausubel et al., supra) with the following exceptions: (i) the starting concentration of
RWR130.1 was 30 nanomolar, (ii) each primer was used at a concentration of 1.5 µM,
(iii) the dNTP concentration was 400 µM for each base, and (iv) the Taq polymerase
(Boehringer Mannheim) was used at 5 units per 100 µl. The double stranded product
was purified on non-denaturing PAGE and isolated by electroelution. The amount of
DNA was determined both by UV absorbance at 260 nm and ethidium bromide
fluorescence comparison with known standards.

Enzymatic Synthesis of RNA. Transcription reactions from double
stranded PCR DNA and synthetic oligonucleotides were performed as described
previously (Milligan and Uhlenbeck, Meth. Enzymol. 180:51 (1989)). Full length
RNA was purified by denaturing PAGE, electroeluted, and desalted as described
above. The pool RNA concentration was estimated using an extinction coefficient of
1300 O.D./µmole; RNA124, 1250 O.D./µmole; RNA 47.1, 480 O.D./µmole.
Transcription from the double stranded pool DNA produced ~90 nanomoles of pool
RNA.
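
The extinction coefficients above convert an absorbance reading into a molar amount. The following minimal sketch shows the arithmetic, assuming that one O.D. unit corresponds to an A260 of 1.0 in 1 ml at a 1 cm path length. The absorbance, dilution, and volume in the example are hypothetical values chosen only to illustrate the calculation, not measurements from the source.

```python
# Extinction coefficients quoted in the text, in O.D. units per micromole.
EXTINCTION_OD_PER_UMOL = {
    "pool RNA": 1300.0,
    "RNA124": 1250.0,
    "RNA 47.1": 480.0,
}

def nanomoles_from_a260(a260, volume_ml, species, dilution=1.0):
    """Convert an A260 reading of a diluted sample into nanomoles of stock RNA.

    a260      : absorbance of the diluted sample (1 cm path, assumed)
    volume_ml : total volume of the undiluted stock
    dilution  : fold dilution applied before the reading
    """
    od_units = a260 * dilution * volume_ml
    micromoles = od_units / EXTINCTION_OD_PER_UMOL[species]
    return micromoles * 1000.0

# Hypothetical reading: A260 = 0.585 for a 200-fold dilution of a 1.0 ml stock.
amount = nanomoles_from_a260(0.585, 1.0, "pool RNA", dilution=200.0)
print(f"{amount:.1f} nmol pool RNA")
```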

Enzymatic Synthesis of RNA-Puromycin Conjugates. Ligation of the
myc and pool messenger RNA sequences to the puromycin containing oligonucleotide
was performed using a DNA splint, termed 19.35 (5'-TTT TTT TTT TAG CGC AAG
A) (SEQ ID NO: 28) using a procedure analogous to that described by Moore and
Sharp (Science 256:992 (1992)). The reaction consisted of mRNA, splint, and
puromycin oligonucleotide (30-P, dA27dCdCP) in a mole ratio of 0.8 : 0.9 : 1.0 and
1-2.5 units of DNA ligase per picomole of pool mRNA. Reactions were conducted
for one hour at room temperature. For the construction of the pool RNA fusions, the
mRNA concentration was ~6.6 µmolar. Following ligation, the RNA-puromycin
conjugate was prepared as described above for enzymatic reactions. The precipitate
was resuspended, and full length fusions were purified on denaturing PAGE and
isolated by electroelution as described above. The pool RNA concentration was
estimated using an extinction coefficient of 1650 O.D./µmole and the myc template
1600 O.D./µmole. In this way, 2.5 nanomoles of conjugate were generated.
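
The splint works by base-pairing across the ligation junction. As a minimal sketch, the reverse complement of splint 19.35 can be compared with the 3' end of the mRNA followed by the 5' end of the 30-P linker. The variable names are illustrative, and the RNA end is written in DNA letters for the comparison.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

SPLINT_19_35 = "TTTTTTTTTTAGCGCAAGA"   # 5' to 3', from the text
MRNA_3P_END = "UCUUGCGCU"              # last 9 nt of RNA124 and of the pool RNA
LINKER_5P_END = "A" * 10               # first 10 nt of 30-P (dA27dCdCP)

# Strand that the splint is expected to hold together, read 5' to 3':
# the mRNA 3' end immediately followed by the linker 5' end.
junction = MRNA_3P_END.replace("U", "T") + LINKER_5P_END

print(reverse_complement(SPLINT_19_35))
print(junction)
print("splint spans the junction:", reverse_complement(SPLINT_19_35) == junction)
```

On this reading, nine splint nucleotides pair with the mRNA and ten pair with the poly-dA linker, placing the ligation junction within the duplex.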

Preparation of dT25 Streptavidin Agarose. dT25 containing a 3' biotin
(synthesized on bioteg phosphoramidite columns (Glen Research)) and desalted on a
NAP-25 column (Pharmacia) was incubated at 1-10 µM or even 1-20 µM with a
slurry of streptavidin agarose (50% agarose by volume, Pierce, Rockford, IL) for 1
hour at room temperature in TE (10 mM Tris Chloride pH 8.2, 1 mM EDTA) and
washed. The binding capacity of the agarose was then estimated optically by the
disappearance of biotin-dT25 from solution and/or by titration of the resin with known
amounts of complementary oligonucleotide.

Translation Reactions using E. coli Derived Extracts and Ribosomes. In
general, translation reactions were performed with purchased kits (for example, E. coli
S30 Extract for Linear Templates, Promega, Madison, WI). However, E. coli
MRE600 (obtained from the ATCC, Rockville, MD) was also used to generate S30
extracts prepared according to published protocols (for example, Ellman et al., Meth.
Enzymol. 202:301 (1991)), as well as a ribosomal fraction prepared as described by
Kudlicki et al. (Anal. Biochem. 206:389 (1992)). The standard reaction was
performed in a 50 µl volume with 20-40 µCi of 35S methionine as a marker. The
reaction mixture consisted of 30% extract v/v, 9-18 mM MgCl2, 40% premix minus
methionine (Promega) v/v, and 5 µM of template (e.g., 43-P). For coincubation
experiments, the oligos 13-P and 25-P were added at a concentration of 5 µM. For
experiments using ribosomes, 3 µl of ribosome solution was added per reaction in
place of the lysate. All reactions were incubated at 37°C for 30 minutes. Templates
were purified as described above under enzymatic reactions.
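
The composition of the standard reaction can be turned into pipetting volumes. The following sketch does this for a 50 µl reaction using the percentages and final concentrations given above. The stock concentrations and the volume of labeled methionine are hypothetical assumptions, since the source does not state them, and the MgCl2 figure is treated here as a final concentration supplied entirely from the added stock.

```python
def reaction_volumes(total_ul=50.0, mg_final_mM=9.0, mg_stock_mM=250.0,
                     template_final_uM=5.0, template_stock_uM=50.0,
                     label_ul=2.0):
    """Return component volumes (in microliters) for one translation reaction.

    Stock concentrations and label volume are illustrative assumptions.
    """
    volumes = {
        "S30 extract (30% v/v)": 0.30 * total_ul,
        "premix minus methionine (40% v/v)": 0.40 * total_ul,
        "MgCl2 stock": total_ul * mg_final_mM / mg_stock_mM,
        "template stock": total_ul * template_final_uM / template_stock_uM,
        "35S methionine": label_ul,
    }
    water = total_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    volumes["water"] = water
    return volumes

mix = reaction_volumes()
for name, microliters in mix.items():
    print(f"{name:36s} {microliters:5.1f} ul")
```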

Wheat Germ Translation Reactions. The translation reactions in Figure 8
were performed using purchased kits lacking methionine (Promega), according to the
manufacturer's recommendations. Template concentrations were 4 µM for 43-P and
0.8 µM for LP77 and LP154. Reactions were performed at 25°C with 30 µCi 35S
methionine in a total volume of 25 µl.

Reticulocyte Translation Reactions. Translation reactions were performed
either with purchased kits (Novagen, Madison, WI) or using extract prepared
according to published protocols (Jackson and Hunt, Meth. Enzymol. 96:50 (1983)).
Reticulocyte-rich blood was obtained from Pel-Freez Biologicals (Rogers, AR). In
both cases, the reaction conditions were those recommended for use with Red Nova
Lysate (Novagen). Reactions consisted of 100 mM KCl, 0.5 mM MgOAc, 2 mM
DTT, 20 mM HEPES pH 7.6, 8 mM creatine phosphate, 25 µM in each amino acid
(with the exception of methionine if 35S Met was used), and 40% v/v of lysate.
Incubation was at 30°C for 1 hour. Template concentrations depended on the
experiment but generally ranged from 50 nM to 1 µM with the exception of 43-P
(Figure 6H) which was 4 nM.

For generation of the randomized pool, 10 ml of translation reaction was
performed at a template concentration of ~0.1 µM (1.25 nanomoles of template). In
addition, 32P labeled template was included in the reaction to allow determination of
the amount of material present at each step of the purification and selection procedure.
After translation at 30°C for one hour, the reaction was cooled on ice for 30-60
minutes.

Isolation of Fusion with dT25 Streptavidin Agarose or Oligo dT Cellulose.
After incubation, the translation reaction was diluted approximately 150 fold into
isolation buffer (1.0 M NaCl, 0.1 M Tris chloride pH 8.2, 10 mM EDTA, and either 1
mM DTT or 0.2% Triton X-100) containing greater than a 10X molar excess of
dT25-biotin-streptavidin agarose whose dT25 concentration was ~10 µM (volume of
slurry equal or greater than the volume of lysate) or oligo dT cellulose (Pharmacia), and
incubated with agitation at 4°C for one hour. The agarose was then removed from the
mixture either by filtration (Millipore ultrafree MC filters) or centrifugation and
washed with cold isolation buffer 2-4 times. The template was then liberated from the
dT25 streptavidin agarose by repeated washing with 50-100 µl aliquots of 15 mM
NaOH, 1 mM EDTA at 4°C, or pure water at room temperature. The eluate was
immediately neutralized in 3 M NaOAc pH 5.2, 10 mM spermidine, and was ethanol
precipitated or used directly for the next step of purification. For the pool reaction,
the total radioactivity recovered indicated approximately 50-70% of the input
template was recovered.

Isolation of Fusion with Thiopropyl Sepharose. Fusions containing
cysteine can be purified using thiopropyl sepharose 6B as in Figure 13 (Pharmacia).
In the experiments described herein, isolation was either carried out directly from the
translation reaction or following initial isolation of the fusion (e.g., with streptavidin
agarose). For samples purified directly, a ratio of 1:10 (v/v) lysate to sepharose was
used. For the pool, 0.5 ml of sepharose slurry was used to isolate all of the fusion
material from 5 ml of reaction mixture. Samples were diluted into a 50:50 (v/v) slurry
of thiopropyl sepharose in 1X TE 8.2 (10 mM Tris-Cl, 1 mM EDTA, pH 8.2)
containing DNase free RNase (Boehringer Mannheim) and incubated with rotation for
1-2 hours at 4°C to allow complete reaction. The excess liquid was removed, and the
sepharose was washed repeatedly with isolation buffer containing 20 mM DTT and
recovered by centrifugation or filtration. The fusions were eluted from the sepharose
using a solution of 25-30 mM dithiothreitol (DTT) in 10 mM Tris chloride pH 8.2, 1
mM EDTA. The fusion was then concentrated by a combination of evaporation under
high vacuum, ethanol precipitation as described above, and, if desired, analyzed by
SDS-Tricine-PAGE. For the pool reaction, the total radioactivity recovered indicated
approximately 1% of the template was converted to fusion.
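
The yield figures above can be related to library size with a short calculation. The following sketch converts the 1.25 nanomoles of pool template used in the translation reaction into a molecule count and applies the approximately 1% conversion to fusion. It ignores losses at other purification steps, so it should be read as an order-of-magnitude estimate only; the function name is illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molecules_from_nanomoles(nanomoles):
    """Convert an amount in nanomoles to a number of molecules."""
    return nanomoles * 1e-9 * AVOGADRO

template_nmol = 1.25        # pool template in the 10 ml translation reaction
fraction_converted = 0.01   # approximately 1% of template converted to fusion

templates = molecules_from_nanomoles(template_nmol)
fusions = templates * fraction_converted

print(f"templates: {templates:.2e}")
print(f"fusions:   {fusions:.2e}")
```

The result falls within the 10^12 to 10^13 range of fusions cited for selections from a randomized library.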

For certain applications, dT25 was added to this eluate and rotated for 1
hour at 4°C. The agarose was rinsed three times with cold isolation buffer, isolated
via filtration, and the bound material eluted as above. Carrier tRNA was added, and
the fusion product was ethanol precipitated. The sample was resuspended in TE pH
8.2 containing DNase free RNase A to remove the RNA portion of the template.

Immunoprecipitation Reactions. Immunoprecipitations of peptides from
translation reactions (Figure 10) were performed by mixing 4 µl of reticulocyte
translation reaction, 2 µl normal mouse sera, and 20 µl Protein G + A agarose
(Calbiochem, La Jolla, CA) with 200 µl of either PBS (58 mM Na2HPO4, 17 mM
NaH2PO4, 68 mM NaCl), dilution buffer (10 mM Tris chloride pH 8.2, 140 mM NaCl,
1% v/v Triton X-100), or PBSTDS (PBS + 1% Triton X-100, 0.5% deoxycholate,
0.1% SDS). Samples were then rotated for one hour at 4°C, followed by
centrifugation at 2500 rpm for 15 minutes. The eluent was removed, and 10 µl of
c-myc monoclonal antibody 9E10 (Calbiochem, La Jolla, CA) and 15 µl of Protein G
+ A agarose were added and rotated for 2 hours at 4°C. Samples were then washed
with two 1 ml volumes of either PBS, dilution buffer, or PBSTDS. 40 µl of gel
loading buffer (Calbiochem Product Bulletin) was added to the mixture, and 20 µl
was loaded on a denaturing PAGE as described by Schagger and von Jagow (Anal.
Biochem. 166:368 (1987)).

Immunoprecipitations of fusions (as shown in Figure 11) were performed
by mixing 8 µl of reticulocyte translation reaction with 300 µl of dilution buffer (10
mM Tris chloride pH 8.2, 140 mM NaCl, 1% v/v Triton X-100), 15 µl protein G
sepharose (Sigma), and 10 µl (1 µg) c-myc antibody 9E10 (Calbiochem), followed by
rotation for several hours at 4°C. After isolation, samples were washed, treated with
DNase free RNase A, labeled with polynucleotide kinase and 32P gamma ATP, and
separated by denaturing urea PAGE (Figure 11).

Reverse Transcription of Fusion Pool. Reverse transcription reactions
were performed according to the manufacturer's recommendation for Superscript II
(Gibco BRL, Grand Island, NY), except that the template, water, and primer were
incubated at 70°C for only two minutes. To monitor extension, 50 µCi alpha 32P
dCTP was included in some reactions; in other reactions, reverse transcription was
monitored using 5' 32P-labeled primers which were prepared using 32P-labeled ATP
(New England Nuclear, Boston, MA) and T4 polynucleotide kinase (New England
Biolabs, Beverly, MA).

Preparation of Protein G and Antibody Sepharose. Two aliquots of 50 µl
Protein G sepharose slurry (50% solid by volume) (Sigma) were washed with
dilution buffer (10 mM Tris chloride pH 8.2, 140 mM NaCl, 0.025% NaN3, 1% v/v
Triton X-100) and isolated by centrifugation. The first aliquot was reserved for use as
a precolumn prior to the selection matrix. After resuspension of the second aliquot in
dilution buffer, 40 µg of c-myc AB-1 monoclonal antibody (Oncogene Science) was
added, and the reaction incubated overnight at 4°C with rotation. The antibody
sepharose was then purified by centrifugation for 15 minutes at 1500-2500 rpm in a
microcentrifuge and washed 1-2 times with dilution buffer.

Selection. After isolation of the fusion and complementary strand
synthesis, the entire reverse transcriptase reaction was used directly in the selection
process. Two protocols are outlined here. For round one, the reverse transcriptase
reaction was added directly to the antibody sepharose prepared as described above and
incubated 2 hours. For subsequent rounds, the reaction is incubated ~2 hours with
washed protein G sepharose prior to the antibody column to decrease the number of
binders that interact with protein G rather than the immobilized antibody.

To elute the pool from the matrix, several approaches may be taken. The
first is washing the selection matrix with 4% acetic acid. This procedure liberates the
peptide from the matrix. Alternatively, a more stringent washing (e.g., using urea or
another denaturant) may be used instead of or in addition to the acetic acid approach.

PCR of Selected Fusions. Selected molecules are amplified by PCR using
standard protocols (for example, Fitzwater and Polisky, Meth. Enzymol. 267:275
(1996); and Conrad et al., Meth. Enzymol. 267:336 (1996)), as described above for
construction of the pool. Performing PCR controls at this step may be desirable to
assure that the amplified pool results from the selection performed. Primer purity is
of central importance. The pairs should be amplified in the absence of input template,
as contamination with pool sequences or control constructs can occur. New primers
should be synthesized if contamination is found. The isolated fusions should also be
subjected to PCR prior to the RT step to assure that they are not contaminated with
cDNA. Finally, the number of cycles needed for PCR reactions before and after
selection should be compared. Large numbers of cycles needed to amplify a given
sequence (>25-30 rounds of PCR) may indicate failure of the RT reaction or problems
with primer pairs.
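
The cycle-number comparison suggested above rests on the roughly exponential nature of PCR. The following sketch, which assumes a constant per-cycle efficiency (an idealization that real reactions only approximate), estimates how many cycles are needed to reach a target copy number and what fold difference in input a given shift in cycle number would imply. The function names and example numbers are hypothetical.

```python
import math

def cycles_needed(start_molecules, target_molecules, efficiency=1.0):
    """Cycles required to amplify start to target at a constant efficiency.

    efficiency = 1.0 means perfect doubling in every cycle.
    """
    if start_molecules >= target_molecules:
        return 0
    per_cycle = 1.0 + efficiency
    return math.ceil(math.log(target_molecules / start_molecules)
                     / math.log(per_cycle))

def fold_amplification(cycles, efficiency=1.0):
    """Fold increase in copy number after a given number of cycles."""
    return (1.0 + efficiency) ** cycles

# Hypothetical example: reaching 1e12 copies from 1e6 recovered molecules.
print(cycles_needed(1e6, 1e12))
# A 6-cycle increase in the cycles required corresponds, under ideal
# doubling, to a 64-fold reduction in amplifiable input.
print(fold_amplification(6))
```

Under these assumptions, needing more than 25-30 cycles corresponds to a very small amount of amplifiable template, which is consistent with the warning in the text.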



SYNTHESIS AND TESTING OF BETA-GLOBIN FUSIONS 

To synthesize a β-globin fusion construct, β-globin cDNA was generated
from 2.5 µg globin mRNA by reverse transcription with 200 pmoles of primer 18.155
(5' GTG GTA TTT GTG AGC CAG) (SEQ ID NO: 29) and Superscript reverse
transcriptase (Gibco BRL) according to the manufacturer's protocol. The primer
sequence was complementary to the 18 nucleotides of β-globin 5' of the stop codon.
To add a T7 promoter, 20 µl of the reverse transcription reaction was removed and
subjected to 6 cycles of PCR with primers 18.155 and 40.54 (5' TAA TAC GAC TCA
CTA TAG GGA CAC TTG CTT TTG ACA CAA C) (SEQ ID NO: 30). The
resulting "syn-β-globin" mRNA was then generated by T7 runoff transcription
according to Milligan and Uhlenbeck (Methods Enzymol. 180:51 (1989)), and the
RNA gel purified, electroeluted, and desalted as described herein. "LP-β-globin" was
then generated from the syn-β-globin construct by ligation of that construct to 30-P
according to the method of Moore and Sharp (Science 256:992 (1992)) using primer
20.262 (5' TTT TTT TTT T GTG GTA TTT G) (SEQ ID NO: 31) as the splint. The
product of the ligation reaction was then gel purified, electroeluted, and desalted as
above. The concentration of the final product was determined by absorbance at 260
nm.

These β-globin templates were then translated in vitro as described in Table
1 in a total volume of 25 µl each. Mg2+ was added from a 25 mM stock solution. All
reactions were incubated at 30°C for one hour and placed at -20°C overnight. dT25
precipitable CPMs were then determined twice using 6 µl of lysate and averaged
minus background.



TABLE 1

Translation Reactions with Beta-Globin Templates

Reaction  Template              Mg2+   35S Met        TCA CPM  dT25 CPM
                                (mM)   (µl)           (2 µl)   (6 µl)

1         -                     1.0    2.0 (20 µCi)   3312     0

2         2.5 µg syn-β-globin   0.5    2.0 (20 µCi)   33860    36

3         2.5 µg                1.0    2.0 (20 µCi)   22470    82

                                       2.0 (20 µCi)

To prepare the samples for gel analysis, 6 µl of each translation reaction
was mixed with 1000 µl of Isolation Buffer (1 M NaCl, 100 mM Tris-Cl pH 8.2, 10
mM EDTA, 0.1 mM DTT), 1 µl RNase A (DNase Free, Boehringer Mannheim), and
20 µl of 20 µM dT25 streptavidin agarose. Samples were incubated at 4°C for one
hour with rotation. Excess Isolation Buffer was removed, and the samples were added
to a Millipore MC filter to remove any remaining Isolation Buffer. Samples were
then washed four times with 50 µl of H2O, and twice with 50 µl of 15 mM NaOH, 1
mM EDTA. The sample (300 µl) was neutralized with 100 µl TE pH 6.8 (10 mM
Tris-Cl, 1 mM EDTA), 1 µl of 1 mg/ml RNase A (as above) was added, and the
samples were incubated at 37°C. 10 µl of 2X SDS loading buffer (125 mM Tris-Cl
pH 6.8, 2% SDS, 2% β-mercaptoethanol, 20% glycerol, 0.001% bromphenol blue) was
then added, and the sample was lyophilized to dryness and resuspended in 20 µl H2O
and 1% β-mercaptoethanol. Samples were then loaded onto a peptide resolving gel as
described by Schagger and von Jagow (Analytical Biochemistry 166:368 (1987)) and
visualized by autoradiography.

The results of these experiments are shown in Figures 15A and 15B. As
indicated in Figure 15A, 35S-methionine was incorporated into the protein portion of
the syn-β-globin and LP-β-globin fusions. The protein was heterogeneous, but one
strong band exhibited the mobility expected for β-globin mRNA. Also, as shown in
Figure 15B, after dT25 isolation and RNase A digestion, no 35S-labeled material
remained in the syn-β-globin lanes (Figure 15B, lanes 2-4). In contrast, in the
LP-β-globin lanes, a homogeneously sized 35S-labeled product was observed.




These results indicated that, as above, a fusion product was isolated by
oligonucleotide affinity chromatography only when the template contained a 3'
puromycin. This was confirmed by scintillation counting (see Table 1). The material
obtained is expected to contain the 30-P linker fused to some portion of β-globin. The
fusion product appeared quite homogeneous in size as judged by gel analysis.
However, since the product exhibited a mobility very similar to natural β-globin
(Figures 15A and 15B, control lanes), it was difficult to determine the precise length
of the protein portion of the fusion product.

FURTHER OPTIMIZATION OF RNA-PROTEIN FUSION FORMATION

Certain factors have been found to further increase the efficiency of
formation of RNA-peptide fusions. Fusion formation, i.e., the transfer of the nascent
peptide chain from its tRNA to the puromycin moiety at the 3' end of the mRNA, is a
slow reaction that follows the initial, relatively rapid translation of the open reading
frame to generate the nascent peptide. The extent of fusion formation may be
substantially enhanced by a post-translational incubation in elevated Mg2+ conditions
(preferably, in a range of 50-100 mM) and/or by the use of a more flexible linker
between the mRNA and the puromycin moiety. In addition, long incubations (12-48
hours) at low temperatures (preferably, -20°C) also result in increased yields of
fusions with less mRNA degradation than that which occurs during incubation at
30°C. By combining these factors, up to 40% of the input mRNA may be converted to
mRNA-peptide fusion products, as shown below.

Synthesis of mRNA-Puromycin Conjugates. In these optimization
experiments, puromycin-containing linker oligonucleotides were ligated to the 3' ends
of mRNAs using bacteriophage T4 DNA ligase in the presence of complementary
DNA splints, generally as described above. Since T4 DNA ligase prefers precise
base-pairing near the ligation junction and run-off transcription products with T7, T3,
or SP6 RNA polymerase are often heterogeneous at their 3' ends (Nucleic Acids
Research 15:8783 (1987)), only those RNAs containing the correct 3'-terminal
nucleotide were efficiently ligated. When a standard DNA splint was used,
approximately 40% of runoff transcription products were ligated to the puromycin
oligo. The amount of ligation product was increased by using excess RNA, but was
not increased using excess puromycin oligonucleotide. Without being bound to a
particular theory, it appeared that the limiting factor for ligation was the amount of
RNA which was fully complementary to the corresponding region of the DNA splint.

To allow ligation of those transcripts ending with an extra non-templated
nucleotide at the 3' terminus (termed "N+1 products"), a mixture of the standard DNA
splint with a new DNA splint containing an additional random base at the ligation
junction was used. The ligation efficiency increased to more than 70% for an
exemplary myc RNA template (that is, RNA124) in the presence of such a mixed
DNA splint.

In addition to this modified DNA splint approach, the efficiency of
mRNA-puromycin conjugate formation was also further optimized by taking into
account the following three factors. First, mRNAs were preferably designed or
utilized which lacked 3'-termini having any significant, stable secondary structure that
would interfere with annealing to a splint oligonucleotide. In addition, because a high
concentration of salt sometimes caused failure of the ligation reaction, thorough
desalting of the oligonucleotides using NAP-25 columns was preferably included as a
step in the procedure. Finally, because the ligation reaction was relatively rapid and
was generally complete within 40 minutes at room temperature, significantly longer
incubation periods were generally avoided, as they often resulted in unnecessary
degradation of the RNA.
Using the above conditions, mRNA-puromycin conjugates were
synthesized as follows. Ligation of the myc RNA sequence (RNA124) to the
puromycin-containing oligonucleotide was performed using either a standard DNA
splint (e.g., 5'-TTTTTTTTTTAGCGCAAGA) (SEQ ID NO: 32) or a splint
containing a random base (N) at the ligation junction (e.g.,
5'-TTTTTTTTTTNAGCGCAAGA) (SEQ ID NO: 33). The reactions consisted of
mRNA, the DNA splint, and the puromycin oligonucleotide in a molar ratio of 1.0 :
1.5-2.0 : 1.0. An alternative molar ratio of 1.0 : 1.2 : 1.4 may also be utilized. A
mixture of these components was first heated at 94°C for 1 minute and then cooled on
ice for 15 minutes. Ligation reactions were performed for one hour at room
temperature in 50 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 10 mM DTT, 1 mM ATP,
25 µg/ml BSA, 15 µM puromycin oligo, 15 µM mRNA, 22.5-30 µM DNA splint,
RNasin inhibitor (Promega) at 1 U/µl, and 1.6-2.5 units of T4 DNA ligase per
picomole of puromycin oligo. Following incubation, EDTA was added to a final
concentration of 30 mM, and the reaction mixtures were extracted with
phenol/chloroform. Full length conjugates were purified by denaturing PAGE,
isolated by electroelution, and desalted.
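
As a bookkeeping aid, the molar ratio and enzyme dose stated above can be scaled to a single reaction. The following is a minimal Python sketch and not part of the described procedure: the function name `ligation_mix` and the 100 µl reaction volume are illustrative assumptions, while the 15 µM mRNA concentration, the 1.0 : 1.5-2.0 : 1.0 ratio, and the 1.6-2.5 units of ligase per picomole of puromycin oligo are taken from the text.

```python
def ligation_mix(mrna_uM=15.0, splint_ratio=(1.5, 2.0), oligo_ratio=1.0,
                 ligase_units_per_pmol=(1.6, 2.5), volume_ul=100.0):
    """Scale the stated mRNA : splint : puromycin-oligo ratio to one reaction.

    volume_ul is a hypothetical reaction volume chosen for illustration.
    """
    oligo_uM = mrna_uM * oligo_ratio
    splint_uM = tuple(mrna_uM * r for r in splint_ratio)
    # 1 uM equals 1 pmol per ul, so uM * ul gives pmol.
    oligo_pmol = oligo_uM * volume_ul
    ligase_units = tuple(u * oligo_pmol for u in ligase_units_per_pmol)
    return {
        "mrna_uM": mrna_uM,
        "oligo_uM": oligo_uM,
        "splint_uM_range": splint_uM,
        "oligo_pmol": oligo_pmol,
        "ligase_units_range": ligase_units,
    }


if __name__ == "__main__":
    for name, value in ligation_mix().items():
        print(name, value)
```

With the default arguments the splint range evaluates to 22.5-30 µM, which agrees with the concentrations listed for the ligation buffer.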

General Reticulocyte Translation Conditions. In addition to improving the
synthesis of the mRNA-puromycin conjugate, translation reactions were also further
optimized as follows. Reactions were performed in rabbit reticulocyte lysates from
different commercial sources (Novagen, Madison, WI; Amersham, Arlington Heights,
IL; Boehringer Mannheim, Indianapolis, IN; Ambion, Austin, TX; and Promega,
Madison, WI). A typical reaction mixture (25 µl final volume) consisted of 20 mM
HEPES pH 7.6, 2 mM DTT, 8 mM creatine phosphate, 100 mM KCl, 0.75 mM
Mg(OAc)2, 1 mM ATP, 0.2 mM GTP, 25 µM of each amino acid (0.7 µM methionine
if 35S-Met was used), RNasin at 1 U/µl, and 60% (v/v) lysate. The final concentration
of template was in the range of 50 nM to 800 nM. For each incubation, all
components except lysate were mixed carefully on ice, and the frozen lysate was
thawed immediately before use. After addition of lysate, the reaction mixture was
mixed thoroughly by gentle pipetting and incubated at 30°C to start translation. The
optimal concentrations of Mg2+ and K+ varied within the ranges of 0.25 mM - 2 mM
and 75 mM - 200 mM, respectively, for different mRNAs and were preferably
determined in preliminary experiments. Particularly for poorly translated mRNAs,
the concentrations of hemin, creatine phosphate, tRNA, and amino acids were also
sometimes optimized. Potassium chloride was generally preferred over potassium
acetate for fusion reactions, but a mixture of KCl and KOAc sometimes produced
better results.

After translation at 30°C for 30 to 90 minutes, the reaction was cooled on
ice for 40 minutes, and Mg2+ and/or K+ were added. The final concentration of Mg2+
added at this step was also optimized for different mRNA templates, but was
generally in the range of 50 mM to 100 mM (with 50 mM being preferably used for
pools of mixed templates). The amount of added K+ was generally in the range of 125
mM-1.5 M. For a Mg2+ reaction, the resulting mixture was preferably incubated at
-20°C for 16 to 48 hours, but could be incubated for as little as 12 hours. If K+ or
Mg2+/K+ were added, the mixture was incubated at room temperature for one hour.
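
Because the post-translational salt addition increases the reaction volume, the volume of stock solution needed follows from a mass balance, C_stock x V_add + C_initial x V_0 = C_target x (V_0 + V_add). The sketch below works this out in Python. The function name and the 1 M stock concentration used in the example are assumptions made for illustration, and any ions contributed by the lysate itself are ignored.

```python
def stock_volume_to_add(reaction_ul, target_mM, stock_mM, initial_mM=0.0):
    """Volume of stock (ul) that brings an ion to target_mM after mixing.

    Solves C_stock*V_add + C_initial*V_0 = C_target*(V_0 + V_add) for V_add.
    """
    if stock_mM <= target_mM:
        raise ValueError("stock must be more concentrated than the target")
    if target_mM < initial_mM:
        raise ValueError("target is below the concentration already present")
    return reaction_ul * (target_mM - initial_mM) / (stock_mM - target_mM)


if __name__ == "__main__":
    # 25 ul translation reaction containing 0.75 mM Mg(OAc)2, raised to 50 mM
    # Mg2+ with a hypothetical 1 M (1000 mM) magnesium stock.
    v_add = stock_volume_to_add(25.0, 50.0, 1000.0, initial_mM=0.75)
    print(f"add {v_add:.2f} ul of stock")
```

The same function applies to K+ additions. If both ions are added together, the two volumes dilute each other and would need to be solved jointly, which this sketch does not attempt.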

To visualize the labeled fusion products, 2 µl of the reaction mixture was
mixed with 4 µl loading buffer, and the mixture was heated at 75°C for 3 minutes.
The resulting mixture was then loaded onto a 6% glycine SDS-polyacrylamide gel
(for 32P-labeled templates) or an 8% tricine SDS-polyacrylamide gel (for
35S-Met-labeled templates). As an alternative to this approach, the fusion products may also
be isolated using dT25 streptavidin agarose or thiopropyl sepharose (or both),
generally as described herein.

To remove the RNA portion of the RNA-linker-puromycin-peptide
conjugate for subsequent analysis by SDS-PAGE, an appropriate amount of EDTA
was added after post-translational incubation, and the reaction mixture was desalted
using a microcon-10 (or microcon-30) column. 2 µl of the resulting mixture
(approximately 25 µl total) was mixed with 18 µl of RNase H buffer (30 mM
Tris-HCl, pH 7.8, 30 mM (NH4)2SO4, 8 mM MgCl2, 1.5 mM β-mercaptoethanol, and an
appropriate amount of complementary DNA splint), and the mixture was incubated at
4°C for 45 minutes. RNase H was then added, and digestion was performed at 37°C
for 20 minutes.

Quality of Puromycin Oligo. The quality of the puromycin
oligonucleotide was also important for the efficient generation of fusion products.
The coupling of 5'-DMT, 2'-succinyl, N-trifluoroacetyl puromycin with CPG was not
as efficient as the coupling of the standard nucleotides. As such, the coupling reaction
was carefully monitored to avoid the formation of CPG with too low a concentration
of coupled puromycin, and unreacted amino groups on the CPG were fully quenched
to avoid subsequent synthesis of oligonucleotides lacking a 3'-terminal puromycin. It
was also important to avoid the use of CPG containing very fine mesh particles, as
these were capable of causing problems with valve clogging during subsequent
automated oligonucleotide synthesis steps.

In addition, the synthesized puromycin oligo was preferably tested before
large scale use to ensure the presence of puromycin at the 3' end. In our experiments,
no fusion was detected if puromycin was substituted with a deoxyadenosine
containing a primary amino group at the 3' end. To test for the presence of 3'
hydroxyl groups (i.e., the undesired synthesis of oligos lacking a 3'-terminal
puromycin), the puromycin oligo may first be radiolabeled (e.g., by
5'-phosphorylation) and then used as a primer for extension with terminal
deoxynucleotidyl transferase. In the presence of a 3'-terminal puromycin moiety, no
extension product should be observed.

Time Course of Translation and Post-Translational Incubation. The
translation reaction was relatively rapid and was generally completed within 25
minutes at 30°C. The fusion reaction, however, was slower. When a standard linker
(dA27dCdCP) was used at 30°C, fusion synthesis reached its maximum level in an
additional 45 minutes. The post-translational incubation could be carried out at lower
temperatures, for example, room temperature, 0°C, or -20°C. Less degradation of the
mRNA template was observed at -20°C, and the best fusion results were obtained
after incubation at -20°C for 2 days.

The Effect of Mg2+ or K+ Concentration. A high concentration of Mg2+ or
K+ in the post-translational incubation greatly stimulated fusion formation. For
example, for the myc RNA template described above, a 3-4 fold stimulation of fusion
formation was observed using a standard linker (dA27dCdCP) in the presence of 50
mM Mg2+ during the 16 hour incubation at -20°C (Figure 17, compare lanes 3 and 4).
Efficient fusion formation was also observed using a post-translational incubation in
the presence of a 50-100 mM Mg2+ concentration when the reactions were carried out
at room temperature for 30-45 minutes. Similarly, addition of 250 - 500 mM K+
increased fusion formation by greater than 7 fold relative to the no added K+ control.
Optimum K+ concentrations were generally between 300 mM and 600 mM (500 mM
for pools). Post-translational addition of NH4Cl also increased fusion formation. The
choice of OAc- vs. Cl- as the anion did not have a profound effect on fusion formation.

Linker Length and Sequence. The dependence of the fusion reaction on
the length of the linker was also examined. In the range between 21 and 30
nucleotides (n = 18-27), little change was seen in the efficiency of the fusion reaction
(as described above). Similar results were obtained for linkers of 19 and 30
nucleotides, and greatest fusion formation was observed for linkers of 25 nucleotides
(Figure 23). Shorter linkers (e.g., 13 or 16 nucleotides in length) and longer linkers
(e.g., linkers greater than 40 nucleotides in length) resulted in much lower fusion
formation. In addition, although particular linkers of greater length (that is, of 45
nucleotides and 54 nucleotides) also resulted in somewhat lower fusion efficiencies, it
remains likely that yet longer linkers may also be used to optimize the efficiency of
the fusion reaction.

With respect to linker sequence, substitution of deoxyribonucleotide
residues near the 3' end with ribonucleotide residues did not significantly change the
fusion efficiency. The dCdCP (or rCrCP) sequence at the 3' end of the linker was,
however, important to fusion formation. Substitution of dCdCP with dUdUP reduced
the efficiency of fusion formation significantly.

Linker Flexibility. The dependence of the fusion reaction on the flexibility
of the linker was also tested. In these experiments, it was determined that the fusion
efficiency was low if the rigidity of the linker was increased by annealing with a
complementary oligonucleotide near the 3' end. Similarly, when a more flexible
linker (for example, dA21C9C9C9dAdCdCP, where C9 represents HO(CH2CH2O)3PO2)
was used, the fusion efficiency was significantly improved. Compared to the standard
linker (dA27dCdCP), use of the more flexible linker (dA21C9C9C9dAdCdCP) improved
the fusion efficiency for RNA124 more than 4-fold (Figure 17, compare lanes 1 and
9). In addition, in contrast to the template with the standard linker whose
post-translation fusion proceeded poorly in the absence of a high concentration of Mg2+
(Figure 17, lanes 3 and 4), the template with the flexible linker did not require elevated
Mg2+ to produce a good yield of fusion product in an extended post-translational
incubation at -20°C (Figure 17, compare lanes 11 and 12). This linker, therefore, was
very useful if post-translational additions of high concentrations of Mg2+ were not
desired. In addition, the flexible linker also produced optimal fusion yields in the
presence of elevated Mg2+.

Quantitation of Fusion Efficiency. Fusion efficiency may be expressed as
either the fraction of translated peptide converted to fusion product, or the fraction of
input template converted to fusion product. To determine the fraction of translated
peptide converted to fusion product, 35S-Met labeling of the translated peptide was
utilized. In these experiments, when a dA27dCdCP or dA27rCrCP linker was used,
about 3.5% of the translated peptide was fused to its mRNA after a 1 hour translation
incubation at 30°C. This value increased to 12% after overnight incubation at -20°C.
When the post-translational incubation was carried out in the presence of a high
concentration of Mg2+, more than 50% of the translated peptide was fused to the
template.

For a template with a flexible linker, approximately 25% of the translated
peptide was fused to the template after 1 hour of translation at 30°C. This value
increased to over 50% after overnight incubation at -20°C and to more than 75% if
the post-translational incubation was performed in the presence of 50 mM Mg2+.

To determine the percentage of the input template converted to fusion
product, the translations were performed using 32P-labeled mRNA-linker template.
When the flexible linker was used and post-translational incubation was performed at
-20°C without addition of Mg2+, about 20%, 40%, 40%, 35%, and 20% of the input
template was converted to mRNA-peptide fusion when the concentration of the input
RNA template was 800, 400, 200, 100, and 50 nM, respectively (Figure 18). Similar
results were obtained when the post-translational incubation was performed in the
presence of 50 mM Mg2+. The best results were achieved using lysates obtained from
Novagen, Amersham, or Ambion (Figure 19).
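
The percentages reported above can also be read as absolute yields by multiplying each by its input template concentration. The short Python sketch below does this arithmetic. The function name is an illustrative assumption, and the input values are the approximate figures quoted in the preceding paragraph.

```python
def fused_template_nM(input_nM, percent_converted):
    """Concentration of mRNA-peptide fusion formed, in nM."""
    return input_nM * percent_converted / 100.0


# Approximate values quoted in the text for the flexible linker at -20 C
# without added Mg2+ (input template in nM -> percent converted to fusion).
REPORTED = {800: 20, 400: 40, 200: 40, 100: 35, 50: 20}

if __name__ == "__main__":
    for conc, pct in REPORTED.items():
        print(f"{conc:4d} nM input, {pct}% converted: "
              f"{fused_template_nM(conc, pct):.0f} nM fusion")
```

Read this way, the fractional conversion is highest at 200-400 nM template, while the absolute amount of fusion levels off at roughly 160 nM for inputs of 400 nM and above.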

The mobility differences between mRNAs and mRNA-peptide fusions as
measured by SDS-PAGE may be very small if the mRNA template is long. In such
cases, the template may be labeled at the 5' end of the linker with 32P (for example,
using [γ-32P]ATP and T4 polynucleotide kinase prior to ligation of the
mRNA-puromycin conjugate). The long RNA portion may then be digested with RNase H in
the presence of a complementary DNA splint after translation/incubation, and the
fusion efficiency determined by quantitation of the ratio of unmodified linker to
linker-peptide fusion. Compared to RNase A digestion, which produces 3'-P and
5'-OH, this approach has the advantage that the 32P at the 5' end of the linker is not
removed.

For RNase H treatment, EDTA was added after post-translational
incubation to disrupt ribosomes, and the reaction mixture was desalted using a
microcon-10 (or microcon-30) column. 2 µl of the resulting mixture was combined
with 18 µl of RNase H buffer (30 mM Tris-HCl, pH 7.8, 30 mM (NH4)2SO4, 8 mM
MgCl2, 1.5 mM β-mercaptoethanol, and an excess of complementary DNA splint) and
incubated at 4°C for 45 minutes. RNase H was then added, and digestion was
performed at 37°C for 20 minutes.

Intramolecular vs. Intermolecular Fusion During Post-Translational
Incubation. In addition to the above experiments, we tested whether the fusion
reaction that occurred at -20°C in the presence of Mg2+ was intra- or intermolecular in
nature. Free linker (dA27dCdCP or dA21C9C9C9dAdCdCP, where C9 is
-O(CH2CH2O)3PO2-) was coincubated with a template containing a DNA linker, but
without puromycin at the 3' end, under the translation and post-translational
incubation conditions described above. In these experiments, no detectable amount
(that is, less than 2% of the normal level) of 35S-Met was incorporated into
linker-peptide product, suggesting that post-translational fusion occurred primarily between
the nascent peptide and the mRNA bound to the same ribosome.

In additional experiments, co-incubations were carried out with templates 
and puromycin oligonucleotides whose fusion products and cross-products (templates 
fused to the wrong protein) could be separated by electrophoresis. No cross-product 
formation was observed for any template and linker combination examined. In these 
experiments, fusion cross-products could form via two different trans mechanisms: (1) 
reaction of free templates or linkers with the peptide in a peptide-mRNA-ribosome 
complex or (2) reaction of the template of one complex with the peptide in another. 
One particular example of testing the latter possibility is shown in Figure 24. There, 
the lambda protein phosphatase (λPPase) template, which synthesizes a protein 221
amino acids long, was coincubated with the myc template, which generates a 33
amino acid peptide. By themselves, both templates demonstrate fusion formation
after post-translation incubation. When mixed together, only the individual fusion
products were observed. No cross-products resulting from fusion of the λPPase protein
with the myc template were seen. Similar experiments showed no cross-product
formation with several different combinations: the myc template + the single codon 
template, a 20:1 ratio of the standard linker + the myc template, and the flexible linker 
+ the myc template. These experiments argued strongly against both possible 
mechanisms of trans fusion formation. 

The effect of linker length on fusion formation was also consistent with an
in cis mechanism. Reduction of the linker length from 19 to 13 nucleotides resulted
in an abrupt decrease in the amount of fusion product, as would be expected if the
chain could no longer reach the peptidyl transferase center from the decoding site
(Figure 23). However, this effect could also be due to occlusion of the puromycin
within the ribosome if the trans mechanism dominated (e.g., if ribosome-bound templates
formed fusion via a trans mechanism). The decrease in fusion formation with longer
linkers again argues against this type of reaction, as no decrease should be seen for the
trans reaction once the puromycin is free of the ribosome.

Optimization Results. As illustrated above, by using the flexible linker
and/or performing the post-translational incubation in the presence of a high
concentration of Mg2+, fusion efficiencies were increased to approximately 40% of
input mRNA. These results indicated that as many as 10^14 molecules of
mRNA-peptide fusion could be generated per ml of in vitro translation reaction mix,
producing pools of mRNA-peptide fusions of very high complexity for use in in vitro
selection experiments.
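
The figure of about 10^14 fusion molecules per ml can be checked with a short calculation. The Python sketch below is illustrative only: the function name is an assumption, and pairing the 40% conversion with a 400 nM template concentration is one reading of the values reported for Figure 18 rather than a statement from the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def fusions_per_ml(template_nM, fraction_fused):
    """Number of mRNA-peptide fusion molecules per ml of translation mix."""
    mol_per_liter = template_nM * 1e-9
    molecules_per_ml = mol_per_liter * AVOGADRO / 1000.0
    return molecules_per_ml * fraction_fused


if __name__ == "__main__":
    # Assumed pairing: 400 nM input template, 40% converted to fusion.
    print(f"{fusions_per_ml(400, 0.40):.2e} fusions per ml")
```

Under these assumptions the result is a little under 10^14 molecules per ml, in line with the estimate given above.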

SELECTIVE ENRICHMENT OF RNA-PROTEIN FUSIONS

We have demonstrated the feasibility of using RNA-peptide fusions in
selection and evolution experiments by enriching a particular RNA-peptide fusion
from a complex pool of random sequence fusions on the basis of the encoded peptide.
In particular, we prepared a series of mixtures in which a small quantity of known
sequence (in this case, the long myc template, LP154) was combined with some
amount of random sequence pool (that is, LP160). These mixtures were translated,
and the RNA-peptide fusion products selected by oligonucleotide and disulfide
affinity chromatography as described herein. The myc-template fusions were
selectively immunoprecipitated with anti-myc monoclonal antibody (Figure 16A). To
measure the enrichment obtained in this selective step, aliquots of the mixture of
cDNA/mRNA-peptide fusions from before and after the immunoprecipitation were
amplified by PCR in the presence of a radiolabeled primer. The amplified DNA was
digested with a restriction endonuclease that cut the myc template sequence but not
the pool (Figures 16B and 16C). Quantitation of the ratio of cut and uncut DNA
indicated that the myc sequence was enriched by 20-40 fold relative to the random
library by immunoprecipitation.

These experiments were carried out as follows. 

Translation Reactions. Translation reactions were performed generally as
described above. Specifically, reactions were performed at 30°C for one hour
according to the manufacturer's specifications (Novagen) and frozen overnight at
-20°C. Two versions of six samples were made, one containing 35S methionine and
one containing cold methionine added to a final concentration of 52 µM. Reactions
1-6 contained the amounts of templates described in Table 2. All numbers in Table 2
represent picomoles of template per 25 µl reaction mixture.

TABLE 2

Template Ratios Used in Doped Selection

Reaction    LP154    LP160
1           -        -
2           5        -
3           1        20
4           0.1      20
5           0.01     20
6           -        20


Preparation of dT25 Streptavidin Agarose. Streptavidin agarose (Pierce)
was washed three times with TE 8.2 (10 mM Tris-Cl pH 8.2, 1 mM EDTA) and
resuspended as a 1:1 (v/v) slurry in TE 8.2. 3' biotinyl T25 synthesized using Bioteg
CPG (Glen Research) was then added to the desired final concentration (generally 10
or 20 µM), and incubation was carried out with agitation for 1 hour. The dT25
streptavidin agarose was then washed three times with TE 8.2 and stored at 4°C until
use.

Purification of Templates from Translation Reactions. To purify templates
from translation reactions, 25 µl of each reaction was removed and added to 7.5 ml of
Isolation Buffer (1 M NaCl, 100 mM Tris-Cl pH 8.2, 10 mM EDTA, 0.1 mM DTT)
and 125 µl of 20 µM dT25 streptavidin agarose. This solution was incubated at 4°C
for one hour with rotation. The tubes were centrifuged and the eluent removed. One
ml of Isolation Buffer was added, the slurry was resuspended, and the mixtures were
transferred to 1.5 ml microcentrifuge tubes. The samples were then washed four
times with 1 ml aliquots of ice cold Isolation Buffer. Hot and cold samples from
identical reactions were then combined in a Millipore MC filter unit and were eluted
from the dT25 agarose by washing with 2 volumes of 100 µl H2O, 0.1 mM DTT, and 2
volumes of 15 mM NaOH, 1 mM EDTA (4°C) followed by neutralization.

To this eluent was added 40 µl of a 50% slurry of washed thiopropyl
sepharose (Pharmacia), and incubation was carried out at 4°C with rotation for 1 hour.
The samples were then washed with three 1 ml volumes of TE 8.2 and the eluent
removed. One µl of 1 M DTT was added to the solid (total volume approximately
20-30 µl), and the sample was incubated for several hours, removed, and washed four
times with 20 µl H2O (total volume 90 µl). The eluent contained 2.5 mM
thiopyridone as judged by UV absorbance. 50 µl of this sample was ethanol
precipitated by adding 6 µl 3 M NaOAc pH 5.2, 10 mM spermine, 1 µl glycogen (10
mg/ml, Boehringer Mannheim), and 170 µl 100% EtOH, incubating for 30 minutes at
-70°C, and centrifuging for 30 minutes at 13,000 rpm in a microcentrifuge.

Reverse Transcriptase Reactions. Reverse transcription reactions were
performed on both the ethanol precipitated and the thiopyridone eluent samples as
follows. For the ethanol precipitated samples, 30 µl of resuspended template, H2O to
48 µl, and 200 picomoles of primer 21.103 (SEQ ID NO: 22) were annealed at 70°C
for 5 minutes and cooled on ice. To this sample, 16 µl of first strand buffer (250 mM
Tris-Cl pH 8.3, 375 mM KCl, and 15 mM MgCl2; available from Gibco BRL, Grand
Island, NY), 8 µl 100 mM DTT, and 4 µl 10 mM NTP were added and equilibrated at
42°C, and 4 µl Superscript II reverse transcriptase (Gibco BRL, Grand Island, NY)
was added. H2O (13 µl) was added to the TP sepharose eluent (35 µl), and reactions
were performed as above. After incubation for one hour, like numbered samples were
combined (total volume 160 µl). 10 µl of sample was reserved for the PCR of each
unselected sample, and 150 µl of sample was reserved for immunoprecipitation.

Immunoprecipitation. To carry out immunoprecipitations, 170 µl of
reverse transcription reaction was added to 1 ml of Dilution Buffer (10 mM Tris-Cl,
pH 8.2, 140 mM NaCl, 1% v/v Triton X-100) and 20 µl of Protein G/A conjugate
(Calbiochem, La Jolla, CA), and precleared by incubation at 4°C with rotation for 1
hour. The eluent was removed, and 20 µl G/A conjugate and 20 µl of monoclonal
antibody (2 µg, 12 picomoles) were added, and the sample incubated with rotation for
two hours at 4°C. The conjugate was precipitated by microcentrifugation at 2500 rpm
for 5 minutes, the eluent removed, and the conjugate washed three times with 1 ml
aliquots of ice cold Dilution Buffer. The sample was then washed with 1 ml ice cold
10 mM Tris-Cl, pH 8.2, 100 mM NaCl. The bound fragments were removed using 3
volumes of frozen 4% HOAc, and the samples were lyophilized to dryness.

PCR of Selected and Unselected Samples. PCR reactions were carried out
by adding 20 µl of concentrated NH4OH to 10 µl of the unselected material and the
entirety of the selected material and incubating for 5 minutes each at 55°C, 70°C, and
90°C to destroy any RNA present in the sample. The samples were then evaporated
to dryness using a speedvac. 200 µl of PCR mixture (1 µM primers 21.103 and
42.108, 200 µM dNTP in PCR buffer plus Mg2+ (Boehringer Mannheim), and 2 µl of
Taq polymerase (Boehringer Mannheim)) was added to each sample. 16 cycles of
PCR were performed on unselected sample number 2, and 19 cycles were performed
on all other samples.

Samples were then amplified in the presence of 5' 32P-labeled primer
21.103 according to Table 3, and purified twice individually using Wizard direct PCR
purification kits (Promega) to remove all primer and shorter fragments.

TABLE 3

Amplification of Selected and Unselected PCR Samples

Sample    Type          Volume    Cycles
1         unselected    20 µl     5
2         unselected    5 µl      4
3         unselected    20 µl     5
4         unselected    20 µl     5
5         unselected    20 µl     5
6         unselected    20 µl     5
1         selected      20 µl     5
2         selected      5 µl      4
3         selected      20 µl     5
4         selected      20 µl     7
5         selected      20 µl     7
6         selected      20 µl     7

Restriction Digests. 32P labeled DNA prepared from each of the above
PCR reactions was added in equal amounts (by cpm of sample) to restriction digest
reactions according to Table 4. The total volume of each reaction was 25 µl. 0.5 µl
of AlwNI (5 units, New England Biolabs) was added to each reaction. Samples were
incubated at 37°C for 1 hour, and the enzyme was heat inactivated by a 20 minute
incubation at 65°C. The samples were then mixed with 10 µl denaturing loading
buffer (1 ml ultrapure formamide (USB), 20 µl 0.5 M EDTA, and 20 µl 1 M NaOH),
heated to 90°C for 1 minute, cooled, and loaded onto a 12% denaturing
polyacrylamide gel containing 8M urea. Following electrophoresis, the gel was fixed
with 10% (v/v) HOAc, 10% (v/v) MeOH, H2O.

TABLE 4

Restriction Digest Conditions w/ AlwNI

Sample    Type          Volume DNA added to reaction    Total volume
1         unselected    20 µl                           25 µl
2         unselected    4 µl                            25 µl
3         unselected    20 µl                           25 µl
4         unselected    20 µl                           25 µl
5         unselected    4 µl                            25 µl
6         unselected    20 µl                           25 µl
1         selected      20 µl                           25 µl
2         selected      8 µl                            25 µl
3         selected      12 µl                           25 µl
4         selected      12 µl                           25 µl
5         selected      20 µl                           25 µl
6         selected      20 µl                           25 µl



Quantitation of Digest. The amount of myc versus pool DNA present in a
sample was quantitated using a phosphorimager (Molecular Dynamics). The amount
of material present in each band was determined as the integrated volume of identical
rectangles drawn around the gel bands. The total cpm present in each band was
calculated as the volume minus the background. Three values of background were
used: (1) an average of identical squares outside the area where counts occurred on
the gel; (2) the cpm present in the unselected pool lane where the myc band should
appear (no band appears at this position on the gel); and (3) a normalized value that
reproduced the closest value to the 10-fold template increments between unselected
lanes. Lanes 2, 3, and 4 of Figures 16B and 16C demonstrate enrichment of the target
versus the pool sequence. The demonstrable enrichment in lane 3
(unselected/selected) yielded the largest values (17, 43, and 27 fold using methods
1-3, respectively) due to the optimization of the signal to noise ratio for this sample.
These results are summarized in Table 5.

TABLE 5

Enrichment of Myc Template vs. Pool

Method    Lane 2 (20)    Lane 3 (200)    Lane 4 (2000)
1         7.0            16.6            5.7
2         10.4           43              39
3         8.7            27              10.2



In a second set of experiments, these same PCR products were purified
once using Wizard direct PCR purification kits, and digests were quantitated by
method (2) above. In these experiments, similar results were obtained; enrichments of
10.7, 38, and 12 fold, respectively, were measured for samples equivalent to those in
lanes 2, 3, and 4 above.
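
The enrichment values reported above are ratios of ratios: the background-corrected myc-to-pool signal after selection divided by the same quantity before selection. The Python sketch below illustrates that arithmetic under stated assumptions. The function name and all band intensities in the example are hypothetical, and a single background value is subtracted from every band, which is a simplification of the three background methods described in the text.

```python
def fold_enrichment(myc_unselected, pool_unselected,
                    myc_selected, pool_selected, background=0.0):
    """Fold enrichment of the myc band relative to the pool band.

    Each argument is an integrated band volume (arbitrary units); the same
    background value is subtracted from every band before taking ratios.
    """
    def ratio(myc, pool):
        myc_net = myc - background
        pool_net = pool - background
        if myc_net <= 0 or pool_net <= 0:
            raise ValueError("band signal does not exceed background")
        return myc_net / pool_net

    return ratio(myc_selected, pool_selected) / ratio(myc_unselected,
                                                      pool_unselected)


if __name__ == "__main__":
    # Hypothetical band volumes, not data from the experiments described.
    print(fold_enrichment(150, 10100, 1100, 10100, background=100))
```

Because the unselected myc band is faint at high dilution, the result is sensitive to the background estimate, which is consistent with the spread among methods 1-3 in Table 5.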

IN VITRO SELECTION FROM A
LARGE RNA-PEPTIDE FUSION LIBRARY
In another experiment demonstrating selection of desired fusion molecules
from large libraries, a repertoire of 2 x 10^13 randomized RNA-peptide fusions was
generated using a modification of the method described above. A DNA library was
generated that contained 27 randomized codons based on the synthesis scheme
5'-(NNS)27-3' (where N represents equimolar A, G, C and T, and S either G or C).
Each NNS codon was a mixture of 32 triplets that included codons for all 20 natural
amino acids. The randomized region was flanked by two primer binding sites for
reverse transcription and PCR, as well as sequences encoding the T7 promoter and an
initiation site for translation. RNA, synthesized by in vitro transcription, was
modified by template-directed ligation to an oligonucleotide linker containing
puromycin on its 3' terminus, dA27dCdC-P.
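
The statement that an NNS codon comprises 32 triplets covering all 20 amino acids can be checked by enumeration against the standard genetic code. The Python sketch below does so. The function names are illustrative, and the estimate of stop-free open reading frames assumes that all 32 triplets are equally likely at every position, which the text does not state.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def nns_codons():
    """All triplets with N = A/C/G/T at positions 1-2 and S = C/G at 3."""
    return ["".join(c) for c in product("ACGT", "ACGT", "CG")]


def summarize_nns(n_codons=27):
    codons = nns_codons()
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
    # Assumes equal representation of all 32 triplets at each position.
    p_no_stop = (1 - len(stops) / len(codons)) ** n_codons
    return {"n_triplets": len(codons), "amino_acids": amino_acids,
            "stop_codons": stops, "fraction_without_stop": p_no_stop}


if __name__ == "__main__":
    for key, value in summarize_nns().items():
        print(key, value)
```

Restricting the third position to G or C leaves TAG as the only stop codon in the mixture, so under the equal-representation assumption a substantial fraction of 27-codon random regions contains no stop.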

Purified ligated RNA was in vitro translated in rabbit reticulocyte extract 
to generate RNA-protein fusions as follows: a 123-mer DNA PP.01 (5'-AGC TTT 
TGG TGC TTG TGC ATC (SNN)27 CTC CTC GCC CTT GCT CAC CAT-3', N = 
A, G, C, T; S = C, G) (SEQ ID NO: 34) was synthesized and purified on a 6% 
denaturing polyacrylamide gel. 1 nmol of the purified DNA (6 x 10^14 molecules) was 
amplified by 3 rounds of PCR (94°C, 1 minute; 65°C, 1 minute; 72°C, 2 minutes) 
using 1 μM primers P1F (5'-AGC TTT TGG TGC TTG TGC ATC-3') (SEQ ID NO: 
35) and PT7 (5'-TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA TTT ACA 
ATT ACA ATG GTG AGC AAG GGC GAG GAG-3') (SEQ ID NO: 36) in a total 
volume of 5 ml (50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton X-100, 2.5 mM 
MgCl2, 0.25 mM dNTPs, 500 Units Promega Taq Polymerase). After precipitation, 
the DNA was redissolved in 100 μl TE (10 mM Tris-HCl pH 7.6, 1 mM EDTA pH 
8.0). DNA (60 μl) was transcribed into RNA in a reaction (1 ml) using the 
Megashortscript In Vitro Transcription kit from Ambion. The reaction was extracted 
twice with phenol/CHCl3 and excess NTPs were removed by purification on a 
NAP-25 column (Pharmacia). The puromycin-containing linker 30-P (5'-dA27dCdCP) 
was synthesized as described herein and added to the 3'-end of the RNA library by 
template-directed ligation. RNA (25 nmol) was incubated with equimolar amounts 
of linker and splint (5'-TTT TTT TTT TNA GCT TTT GGT GCT TG-3') (SEQ ID 
NO: 37) in a reaction (1.5 ml) containing T4 DNA ligase buffer (Promega) and 1200 
Units T4 DNA ligase (Promega). After incubation at room temperature for 4 hours, 
ligated RNA was separated from unligated RNA on a 6% denaturing polyacrylamide 
gel, eluted from the gel, and redissolved (200 μl ddH2O). To generate mRNA-peptide 
fusion molecules, ligated RNA (1.25 nmol) was translated in a total volume of 7.5 ml 
using the Rabbit Reticulocyte IVT kit from Ambion in the presence of 3.7 μCi 
35S-methionine. After incubation (30 minutes at 30°C), the reaction was brought to a 
final concentration of 530 mM KCl and 150 mM MgCl2 and incubated for a further 1 
hour at room temperature. Fusion formation was enhanced about 10-fold by this 
addition of 530 mM KCl and 150 mM MgCl2 after the translation reaction was 
completed. 
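
The relationship between the synthesized oligonucleotide, the coding strand, and the splint can be verified with a short calculation. The Python sketch below is illustrative only: it works at the DNA level (the transcript would carry U in place of T), it omits the T7 promoter and 5' untranslated sequence contributed by primer PT7, and the function names are hypothetical.

```python
AVOGADRO = 6.022e23  # molecules per mole


def nmol_to_molecules(nmol):
    """Convert an amount in nanomoles to a number of molecules."""
    return nmol * 1e-9 * AVOGADRO


# N (any base) and S (G or C) are each their own complement set.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N", "S": "S"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# PP.01 as written in the text (the strand that was synthesized).
FIVE_PRIME = "AGCTTTTGGTGCTTGTGCATC"
THREE_PRIME = "CTCCTCGCCCTTGCTCACCAT"
template = FIVE_PRIME + "SNN" * 27 + THREE_PRIME

# The coding strand is the reverse complement of PP.01.
sense = reverse_complement(template)

# Portion of the splint that follows the T10 tract and the N position.
splint_anneal = "AGCTTTTGGTGCTTG"

print(len(template))                    # length of PP.01
print(sense[:21], sense[21:27])         # start of the coding strand
print(sense.endswith(reverse_complement(splint_anneal)))
print(f"{nmol_to_molecules(1):.1e}")    # molecules in 1 nmol
```

Read in this way, the (SNN)27 cassette of PP.01 corresponds to (NNS)27 on the coding strand, preceded by ATG GTG AGC AAG GGC GAG GAG, and the splint is complementary to the last 15 nucleotides of the transcript.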

Using this improved method, about 10^13 purified fusion molecules per ml 
were obtained. RNA-peptide fusions were purified from the crude translation reaction 
by oligonucleotide affinity chromatography, and the RNA portion of the joint 
molecules was reverse transcribed prior to the selection step using RNase H-free 
reverse transcriptase as follows. Translated fusion products were incubated with dT25 
cellulose (Pharmacia) in incubation buffer (100 mM Tris-HCl pH 8.0, 10 mM EDTA 
pH 8.0, 1 M NaCl and 0.25% Triton X-100; 1 hour at 4°C). The cellulose was 
isolated by filtration and washed with incubation buffer, followed by elution of the 
fusion products with ddH2O. The RNA was reverse transcribed (25 mM Tris-HCl pH 
8.3, 75 mM KCl, 3 mM MgCl2, 10 mM DTT, and 0.5 mM dNTPs with 2 Units of 
Superscript II Reverse Transcriptase (Gibco BRL)) using a 5-fold excess of splint as 
primer. 

To explore the power of the RNA-protein fusion selection technology, the 
library was used to select peptides that bound to a c-myc monoclonal antibody using 
immunoprecipitation as the selection tool. Five rounds of repeated selection and 
amplification resulted in increased binding of the population of fusion molecules to 
the anti-myc monoclonal antibody 9E10 (Evan et al., Mol. Cell Biol. 5:3610 (1985)). 
Less than 1% of the library applied to the selection step was recovered by elution in 
each of the first three rounds of selection; however, about 10% of the library bound to 
the antibody and was eluted in the fourth selection round. The proportion of binding 
molecules increased to 34% in the fifth round of selection. This result agreed well 
with the percentage of a wild type c-myc fusion construct that bound to the anti-myc 
antibody under these conditions (35%). In the sixth round of selection, no further 
enrichment was observed, and fusion molecules from the fifth and sixth rounds were 
used for characterization and sequence determination of the selected peptides. 

To carry out these experiments, the starting library of 2 x 10^13 molecules 
was incubated with a 12-fold excess of the c-myc binding antibody 9E10 (Chemicon) 
in selection buffer (1X PBS, 0.1% BSA, 0.05% Tween) for 1 hour at 4°C. The 
peptide fusion-antibody complexes were precipitated by adding protein A-sepharose. 
After additional incubation for 1 hour at 4°C, the sepharose was isolated 
by filtration, and the flow through (FT) was collected. The sepharose was washed 
with five volumes of selection buffer (W1 - W5) to remove non-specific binders, and 
binding peptides were eluted with four volumes of 15 mM acetic acid (E1 - E4). The 
cDNA portion of the eluted fusion molecules was amplified by PCR, and the resulting 
DNA was used to generate an enriched population of fusion products, which was 
submitted to further rounds of selection. In order to remove peptides with affinity for 
protein A-sepharose from the pool, a pre-selection on protein A-sepharose was 
introduced in the second round of selection. The progress of the selection was 
monitored by determining the percentage of 35S-labeled RNA-peptide fusion that was 
eluted from the immunoprecipitate with acetic acid. These results are shown in 
Figure 20. 

The pool of selected peptides was demonstrated to specifically bind the 
anti-myc antibody used for selection. Binding experiments with round 6 unfused 
peptides showed similar binding to the antibody compared to fused peptide, indicating 
that the nucleic acid portion of the fusion molecules was not needed for binding (data 
not shown). 

Fusion products from the sixth round of selection were evaluated under 
three different immunoprecipitation conditions, as follows: (1) without the anti-myc 
antibody, (2) with the anti-integrin monoclonal antibody ASC-3, which is of the same 
isotype but does not bind the myc epitope, and (3) with the anti-myc antibody 9E10. 
Experiments were carried out by incubating 35S-labeled RNA-peptide fusion products 
from the sixth round of selection (0.2 pmol) in selection buffer (1X PBS, 0.1% BSA, 
0.05% Tween) for 1 hour at 4°C either with anti-myc monoclonal antibody 9E10 
(100 pmol), with anti-integrin β4 monoclonal antibody ASC-3 (100 pmol; Chemicon), 
or without antibody. Peptide fusion-antibody complexes were precipitated with 
protein A-sepharose. After washing the sepharose with five volumes of selection 
buffer, bound species were eluted by the addition of 15 mM acetic acid. 

No significant binding could be detected in the control experiment without 
antibody, showing that the selected peptides did not bind nonspecifically to protein 
A-agarose. In addition, no binding to the anti-integrin monoclonal antibody was 
observed, indicating that the selected peptides were specific for the anti-myc antibody. 
A competition experiment with synthetic myc peptide was performed to determine 
whether the selected peptide fusion molecules interacted with the antigen-binding site 
of the anti-myc antibody 9E10. When 35S-labeled fusion molecules from the sixth 
round of selection were incubated with anti-myc monoclonal antibody and increasing 
amounts of unlabeled myc peptide, the percentage of binding molecules decreased. 
These results are shown in Figure 21. In this figure, 0.2 pmol 35S-labeled 
RNA-peptide fusion products from the sixth round of selection were incubated with 
100 pmol anti-myc monoclonal antibody 9E10 in the presence of 0, 0.2, 1, 2, or 10 
nmol synthetic myc peptide (Calbiochem). The peptide fusion-antibody complexes 
were precipitated by addition of protein A-sepharose. The values represent the 
average percentage of fusion molecules that bound to the antibody and could be eluted 
with 15 mM acetic acid, determined in triplicate binding reactions. The competition 
data demonstrated that the majority of the isolated fusion molecules were specific for 
the myc binding site. 

Sequence analysis of 116 individual clones derived from the fifth and sixth 
rounds of selection identified one sequence that occurred twice and contained the wild 
type c-myc epitope EQKLISEEDL (SEQ ID NO: 2). A third sequence was almost 
identical to the other two, but showed two point mutations at the nucleotide level, one 
of which caused a mutation from Ile to Val in the conserved myc epitope region. All 
sequences contained a consensus motif, X(Q,E)XLISEXX(L,M) (SEQ ID NO: 38), 
which was very similar to the c-myc epitope. The core region of four amino acids, 
LISE, was most highly conserved. Figure 22 illustrates the amino acid sequences of 
12 selected peptides isolated from the random 27-mer library. At the top of the figure, 
the amino acid sequence of the c-myc epitope is shown. Of the sequences shown, 
only the regions containing the consensus motif are included. Residues within the 
peptides that match the consensus have been highlighted. Clone R6-63 contained the 
wild type myc epitope. Consensus residues (> 50% frequency at a given position) 
appear at the bottom of the figure. 
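
A consensus motif of this kind can be searched for in translated clone sequences with a regular expression. The Python sketch below is illustrative only; it assumes that each X in X(Q,E)XLISEXX(L,M) stands for exactly one residue of any identity, and the function name find_motif and the second test peptide are hypothetical (the latter is not a sequence from Figure 22).

```python
import re

# X(Q,E)XLISEXX(L,M) written as a regular expression over one-letter codes.
# Assumption: every X is exactly one residue of any identity.
CONSENSUS = re.compile(r".[QE].LISE..[LM]")


def find_motif(peptide):
    """Return (start index, matched segment) for the first match, or None."""
    match = CONSENSUS.search(peptide)
    return (match.start(), match.group()) if match else None


wild_type = "EQKLISEEDL"             # c-myc epitope, SEQ ID NO: 2
hypothetical = "GGAEELISEAAMGG"      # made-up peptide for illustration only

print(find_motif(wild_type))
print(find_motif(hypothetical))
print(find_motif("AAAAAAAAAA"))
```

The wild type epitope satisfies the expression over its full length, which is consistent with the statement that the consensus closely resembles the c-myc epitope.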

Taking into consideration that the conserved motif contained one amino 
acid that was coded for by the defined 5' primer region, we calculated that the known 
10 amino acid c-myc epitope was represented only about 60 times in the starting 
pool of 2 x 10^13 molecules. The observed enrichment of the wild type epitope in five 
rounds of selection corresponded well with an enrichment factor of > 200 per 
selection round, a factor which was confirmed in a separate series of experiments. 
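
To give a feel for what a per-round enrichment factor of this size implies, the following Python sketch iterates a simple two-class model (binders and non-binders). It is not the calculation performed by the inventors: the model, the assumption of a constant factor of exactly 200, and the treatment of all binders as a single class are simplifications introduced here for illustration, and the function names are hypothetical.

```python
def enrich(fraction, factor):
    """One selection round: binders are recovered `factor` times more
    efficiently than non-binders (assumed two-class model)."""
    binders = fraction * factor
    return binders / (binders + (1.0 - fraction))


def simulate(start_copies, pool_size, factor, rounds):
    """Return the binder fraction before selection and after each round."""
    fraction = start_copies / pool_size
    history = [fraction]
    for _ in range(rounds):
        fraction = enrich(fraction, factor)
        history.append(fraction)
    return history


# About 60 copies in a pool of 2 x 10^13 molecules, factor of 200, five rounds.
history = simulate(60, 2e13, 200, 5)
for round_number, fraction in enumerate(history):
    print(round_number, f"{fraction:.2e}")
```

In this simplified model the binder fraction stays far below 1% for the first three rounds and rises steeply only in the fourth and fifth rounds, which is qualitatively the pattern reported for the selection.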

Immunoprecipitation assays performed on the twelve selected sequences 
shown in Figure 22 confirmed specific binding of the library-derived RNA-peptide 
fusions to the antigen-binding site of the anti-myc monoclonal antibody. As 
RNA-peptide fusions, all twelve sequences bound to the anti-myc antibody and 
exhibited no binding to protein A-sepharose. Competitive binding for the anti-myc 
antibody was also compared using 35S-labeled fusion products (derived from the 
twelve sequences) and unlabeled synthetic myc peptide. Under the conditions used, 
labeled wild type myc fusion bound at 9% in the presence of unlabeled myc peptide, 
and the percentage of binding varied between 0.4% and 12% for the twelve sequences 
tested. These data indicated that the sequences bound the myc antibody with an 
affinity similar to that of the wild type myc fusion. 

PURIFICATION OF ARM MOTIF PEPTIDES 
AND FUSIONS WITH IMMOBILIZED RNA 
RNA binding sites for the λ-boxBR (Cilley and Williamson, RNA 3:57-67 
(1997)), BIV-TAR (Puglisi et al., Science 270:1200-1203 (1995)), and HIV-RRE 
(Battiste et al., Science 273:1547-1551 (1997)) were synthesized containing a 3' biotin 
moiety using standard phosphoramidite chemistry. The synthetic RNA samples were 
deprotected, desalted, and gel purified as described herein. The 3' biotinyl-RNA sites 
were then immobilized by mixing a concentrated stock of the RNA with a 50% v/v 
slurry of ImmunoPure streptavidin agarose (Pierce) in 1X TE 8.2 at a final RNA 
concentration of 5 mM for one hour (25°C) with shaking. Two translation reactions 
were performed containing (1) the template coding for the λN peptide fragment or (2) 
globin mRNA (Novagen) as a control. Aliquots (50 μl of a 50% slurry v/v) of each 
immobilized RNA were washed and resuspended in 500 μl of binding buffer (100 
mM KCl, 1 mM MgCl2, 10 mM Hepes-KOH pH 7.5, 0.5 mM EDTA, 0.01% NP-40, 1 
mM DTT, 50 μg/ml yeast tRNA). Binding reactions were performed by adding 15 μl 
of the translation reaction containing either the N peptide or globin templates to tubes 
containing one of the three immobilized binding sites, followed by incubation at room 
temperature for one hour. The beads were precipitated by centrifugation and washed 2X 
with 100 μl of binding buffer. RNase A (DNase free, 1 μl, 1 mg/ml) (Boehringer 
Mannheim) was added and incubated for one hour at 37°C to liberate bound 
molecules. The supernatant was removed, mixed with 30 μl of SDS loading buffer, 
and analyzed by SDS-Tricine PAGE. The same protocol was used for isolation of N 
peptide fusions, with the exception that 35 mM MgCl2 was added after the translation 
reaction, followed by incubation at room temperature for one hour to promote fusion 
formation. 

The results of these experiments demonstrated that the N peptide retained 
its normal binding specificity both when synthesized in vitro and when generated as 
an RNA-peptide fusion with its own mRNA. This result was of critical importance. 
The attachment of a long nucleic acid sequence to the C terminus of a peptide or 
protein (i.e., fusion formation) has the potential to disrupt the polypeptide function 
relative to the unfused sequence. Arginine rich motif (ARM) peptides represent a 
stringent functional test of the fusion system due to their relatively high nonspecific 
nucleic acid binding properties. The fact that the N peptide-mRNA fusion (prior to 
cDNA synthesis) retained the function of the free peptide indicates that specificity is 
maintained even when there is a likelihood of forming either self- or non-specific 
complexes. 

USE OF PROTEIN SELECTION SYSTEMS 
The selection systems of the present invention have commercial 
applications in any area where protein technology is used to solve therapeutic, 
diagnostic, or industrial problems. This selection technology is useful for improving 
or altering existing proteins as well as for isolating new proteins with desired 
functions. These proteins may be naturally-occurring sequences, may be altered 
forms of naturally-occurring sequences, or may be partly or fully synthetic sequences. 
In addition, these methods may also be used to isolate or identify useful nucleic acid 
or small molecule targets. 

Isolation of Novel Binding Reagents. In one particular application, the 
RNA-protein fusion technology described herein is useful for the isolation of proteins 
with specific binding (for example, ligand binding) properties. Proteins exhibiting 
highly specific binding interactions may be used as non-antibody recognition 
reagents, allowing RNA-protein fusion technology to circumvent traditional 
monoclonal antibody technology. Antibody-type reagents isolated by this method 
may be used in any area where traditional antibodies are utilized, including diagnostic 
and therapeutic applications. 

Improvement of Human Antibodies. The present invention may also be 
used to improve human or humanized antibodies for the treatment of any of a number 
of diseases. In this application, antibody libraries are developed and are screened in 
vitro, eliminating the need for techniques such as cell-fusion or phage display. In one 
important application, the invention is useful for improving single chain antibody 
libraries (Ward et al., Nature 341:544 (1989); and Goulot et al., J. Mol. Biol. 213:617 
(1990)). For this application, the variable region may be constructed either from a 
human source (to minimize possible adverse immune reactions of the recipient) or 
may contain a totally randomized cassette (to maximize the complexity of the library). 
To screen for improved antibody molecules, a pool of candidate molecules is tested 
for binding to a target molecule (for example, an antigen immobilized as shown in 
Figure 2). Higher levels of stringency are then applied to the binding step as the 
selection progresses from one round to the next. To increase stringency, conditions 
such as number of wash steps, concentration of excess competitor, buffer conditions, 
length of binding reaction time, and choice of immobilization matrix are altered. 

Single chain antibodies may be used either directly for therapy or 
indirectly for the design of standard antibodies. Such antibodies have a number of 
potential applications, including the isolation of anti-autoimmune antibodies, immune 
suppression, and the development of vaccines for viral diseases such as AIDS. 

Isolation of New Catalysts. The present invention may also be used to 
select new catalytic proteins. In vitro selection and evolution have been used 
previously for the isolation of novel catalytic RNAs and DNAs, and, in the present 
invention, are used for the isolation of novel protein enzymes. In one particular 
example of this approach, a catalyst may be isolated indirectly by selecting for 
binding to a chemical analog of the catalyst's transition state. In another particular 
example, direct isolation may be carried out by selecting for covalent bond formation 
with a substrate (for example, using a substrate linked to an affinity tag) or by 
cleavage (for example, by selecting for the ability to break a specific bond and thereby 
liberate catalytic members of a library from a solid support). 

This approach to the isolation of new catalysts has at least two important 
advantages over catalytic antibody technology (reviewed in Schultz et al., J. Chem. 
Engng. News 68:26 (1990)). First, in catalytic antibody technology, the initial pool is 
generally limited to the immunoglobulin fold; in contrast, the starting library of 
RNA-protein fusions may be either completely random or may consist, without 
limitation, of variants of known enzymatic structures or protein scaffolds. In addition, 
the isolation of catalytic antibodies generally relies on an initial selection for binding 
to transition state reaction analogs followed by laborious screening for active 
antibodies; again, in contrast, direct selection for catalysis is possible using an 
RNA-protein fusion library approach, as previously demonstrated using RNA 
libraries. In an alternative approach to isolating protein enzymes, the 
transition-state-analog and direct selection approaches may be combined. 

Enzymes obtained by this method are highly valuable. For example, there 
currently exists a pressing need for novel and effective industrial catalysts that allow 
improved chemical processes to be developed. A major advantage of the invention is 
that selections may be carried out in arbitrary conditions and are not limited, for 
example, to in vivo conditions. The invention therefore facilitates the isolation of 
novel enzymes or improved variants of existing enzymes that can carry out highly 
specific transformations (and thereby minimize the formation of undesired 
byproducts) while functioning in predetermined environments, for example, 
environments of elevated temperature, pressure, or solvent concentration. 

An In Vitro Interaction Trap. The RNA-protein fusion technology is also 
useful for screening cDNA libraries and cloning new genes on the basis of 
protein-protein interactions. By this method, a cDNA library is generated from a 
desired source (for example, by the method of Ausubel et al., supra, chapter 5). To 
each of the candidate cDNAs, a peptide acceptor (for example, as a puromycin tail) is 
ligated (for example, using the techniques described above for the generation of LP77, 
LP154, and LP160). RNA-protein fusions are then generated as described herein, and 
the ability of these fusions (or improved versions of the fusions) to interact with 
particular molecules is then tested as described above. If desired, stop codons and 3' 
UTR regions may be avoided in this process by (i) adding suppressor tRNA to 
allow readthrough of the stop regions, (ii) removing the release factor from the 
translation reaction by immunoprecipitation, (iii) a combination of (i) and (ii), or (iv) 
removing the stop codons and 3' UTR from the DNA sequences. 

The fact that the interaction step takes place in vitro allows careful control 
of the reaction stringency, using nonspecific competitor, temperature, and ionic 
conditions. Replacement of normal small molecules with non-hydrolyzable analogs 
(e.g., ATP vs. ATPγS) provides for selections that discriminate between different 
conformers of the same molecule. This approach is useful for both the cloning and 
functional identification of many proteins since the RNA sequence of the selected 
binding partner is covalently attached and may therefore be readily isolated. In 
addition, the technique is useful for identifying functions and interactions of the 
~50-100,000 human genes, whose sequences are currently being determined by the 
Human Genome Project. 

USE OF RNA-PROTEIN FUSIONS IN A MICROCHIP FORMAT 
"DNA chips" consist of spatially defined arrays of immobilized 
oligonucleotides or cloned fragments of cDNA or genomic DNA, and have 
applications such as rapid sequencing and transcript profiling. By annealing a 
mixture of RNA-protein fusions (for example, generated from a cellular DNA or RNA 
pool), to such a DNA chip, it is possible to generate a "protein display chip," in which 
each spot corresponding to one immobilized sequence is capable of annealing to its 
corresponding RNA sequence in the pool of RNA-protein fusions. By this approach, 
the corresponding protein is immobilized in a spatially defined manner because of its 
linkage to its own mRNA, and chips containing sets of DNA sequences display the 
corresponding set of proteins. Alternatively, peptide fragments of these proteins may 
be displayed if the fusion library is generated from smaller fragments of cDNAs or 
genomic DNAs. 
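
The spatial addressing described above can be pictured as a lookup from chip position to the fusions whose RNA portions are complementary to the probe at that position. The Python sketch below is a schematic illustration only: it assumes perfect, exact-match complementarity, ignores hybridization thermodynamics and mismatches, and uses made-up probe sequences, RNA sequences, and protein names.

```python
# Map each DNA base to the RNA base it pairs with.
DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def target_site(probe_dna):
    """RNA sequence (5' to 3') to which a DNA probe (5' to 3') anneals."""
    return probe_dna.translate(DNA_TO_RNA_COMPLEMENT)[::-1]


def assign_to_spots(spots, fusions):
    """Return {chip position: [names of proteins displayed at that spot]}.

    spots   -- {position: immobilized DNA probe sequence}
    fusions -- list of (RNA sequence, protein name) pairs
    """
    layout = {}
    for position, probe in spots.items():
        site = target_site(probe)
        layout[position] = [name for rna, name in fusions if site in rna]
    return layout


# Hypothetical chip and fusion pool, for illustration only.
spots = {(0, 0): "CTGATCAGT", (0, 1): "GGTTCCAAG"}
fusions = [
    ("AUGACUGAUCAGGGC", "protein_A"),
    ("AUGCUUGGAACCUAA", "protein_B"),
]

print(assign_to_spots(spots, fusions))
```

Because each protein remains linked to its own mRNA, knowing which probe occupies a position is sufficient in this scheme to infer which protein is displayed there.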

Such ordered displays of proteins and peptides have many uses. For 
example, they represent powerful tools for the identification of previously unknown 
protein-protein interactions. In one specific format, a probe protein is detectably 
labeled (for example, with a fluorescent dye), and the labeled protein is incubated 
with a protein display chip. By this approach, the identity of proteins that are able to 
bind the probe protein is determined from the location of the spots on the chip that 
become labeled due to binding of the probe. Another application is the rapid 
determination of proteins that are chemically modified through the action of 
modifying enzymes (for example, protein kinases, acyl transferases, and methyl 
transferases). By incubating the protein display chip with the enzyme of interest and a 
radioactively labeled substrate, followed by washing and autoradiography, the 
location and hence the identity of those proteins that are substrates for the modifying 
enzyme may be readily determined. In addition, the use of this approach with ordered 
displays of small peptides allows the further localization of such modification sites. 

Protein display technology may be carried out using arrays of nucleic acids 
(including RNA, but preferably DNA) immobilized on any appropriate solid support. 
Exemplary solid supports may be made of materials such as glass (e.g., glass plates), 
silicon or silicon-glass (e.g., microchips), or gold (e.g., gold plates). Methods for 
attaching nucleic acids to precise regions on such solid surfaces, e.g., 
photolithographic methods, are well known in the art, and may be used to generate 
solid supports (such as DNA chips) for use in the invention. Exemplary methods for 
this purpose include, without limitation, Schena et al., Science 270:467-470 (1995); 
Kozal et al., Nature Medicine 2:753-759 (1996); Cheng et al., Nucleic Acids Research 
24:380-385 (1996); Lipshutz et al., BioTechniques 19:442-447 (1995); Pease et al., 
Proc. Natl. Acad. Sci. USA 91:5022-5026 (1994); Fodor et al., Nature 364:555-556 
(1993); Pirrung et al., U.S. Patent No. 5,143,854; and Fodor et al., WO 92/10092. 

Claims 

1. A method for producing a protein library, comprising the steps of: 

a) providing a population of RNA molecules, each of which comprises a 
translation initiation sequence and a start codon operably linked to a protein coding 
sequence and each of which is operably linked to a peptide acceptor at the 3' end of 
said protein coding sequence; 

b) in vitro translating said protein coding sequences to produce a 
population of RNA-protein fusions; and 

c) further incubating said population of RNA-protein fusions under high 
salt conditions, thereby producing a protein library. 

2. A method for producing a DNA library, comprising the steps of: 

a) providing a population of RNA molecules, each of which comprises a 
translation initiation sequence and a start codon operably linked to a protein coding 
sequence and each of which is operably linked to a peptide acceptor at the 3' end of 
said protein coding sequence; 

b) in vitro translating said protein coding sequences to produce a 
population of RNA-protein fusions; 

c) further incubating said population of RNA-protein fusions under high 
salt conditions; and 

d) generating from each of said RNA portions of said fusions a DNA 
molecule, thereby producing a DNA library. 

3. A method for the selection of a desired protein or nucleic acid encoding 
said protein, comprising the steps of: 

a) providing a population of candidate RNA molecules, each of which 
comprises a translation initiation sequence and a start codon operably linked to a 
candidate protein coding sequence and each of which is operably linked to a peptide 
acceptor at the 3' end of said candidate protein coding sequence; 

b) in vitro translating said candidate protein coding sequences to produce a 
population of candidate RNA-protein fusions; 

c) further incubating said population of candidate RNA-protein fusions 
under high salt conditions, thereby producing a protein library; and 

d) selecting a desired RNA-protein fusion, thereby selecting said desired 
protein and said nucleic acid encoding said protein. 

4. The method of any of claims 1-3, wherein said high salt comprises a 
monovalent cation. 

5. The method of claim 4, wherein said monovalent cation is at a 
concentration of between approximately 125 mM - 1.5 M. 

6. The method of claim 5, wherein said monovalent cation is at a 
concentration of between approximately 300 mM - 600 mM. 

7. The method of claim 4, wherein said monovalent cation is K+ or NH4+. 

8. The method of claim 4, wherein said monovalent cation is Na+. 

9. The method of claim 7, wherein said incubating step is carried out at 
approximately room temperature. 

10. The method of any of claims 1-3, wherein said high salt comprises a 
divalent cation. 

11. The method of claim 10, wherein said divalent cation is at a 
concentration of between approximately 25 mM - 200 mM. 

12. The method of claim 10, wherein said divalent cation is Mg+2. 

13. The method of any of claims 1-3, wherein said high salt comprises 
both a monovalent and a divalent cation. 

14. The method of any of claims 1-3, wherein each of said RNA 
molecules further comprises a pause sequence or further comprises a DNA or DNA 
analog sequence covalently bonded to the 3' end of said RNA molecule. 

15. The method of claim 14, wherein said pause sequence or said DNA or 
DNA analog sequence is of a length sufficient to span the distance between the 
decoding site and the peptidyl transfer center of a ribosome. 

16. The method of claim 14, wherein said pause sequence or said DNA or 
DNA analog sequence is approximately 60-70 Å in length. 

17. The method of claim 14, wherein said pause sequence or said DNA or 
DNA analog sequence is less than approximately 80 nucleotides in length. 

18. The method of claim 14, wherein said pause sequence or said DNA or 
DNA analog sequence is less than approximately 45 nucleotides in length. 

19. The method of claim 14, wherein said pause sequence or said DNA or 
DNA analog sequence is between approximately 21-30 nucleotides in length. 

20. The method of claim 14, wherein said pause sequence or said DNA or 
DNA analog sequence is joined to said RNA molecule using a DNA splint. 



21. The method of claim 14, wherein said pause sequence or said DNA or 
DNA analog sequence comprises a non-nucleotide moiety. 

22. The method of claim 14, wherein said non-nucleotide moiety is one or 
more HO(CH2CH2O)3PO2 moieties. 

23. The method of any of claims 1-3, wherein said RNA-protein fusion 
further comprises a nucleic acid or nucleic acid analog sequence positioned proximal 
to said peptide acceptor which increases flexibility. 

SEQUENCE LISTING 

<110> The General Hospital Corporation 

<120> SELECTION OF PROTEINS USING RNA-PROTEIN FUSIONS 

<130> 00786/350WO5 

<150> 09/247,190 
<151> 1999-02-09 

<160> 33 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 123 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Translation template 
<400> 1 

rgrgrgrarg rgrarcrgra rararurgrg rararcrarg rarararcru rgrarurcru 60 

rcrurgrara rgrarargra rcrcrurgra rarcaaaaaa aaaaaaaaaa aaaaaaaaaa 120 

acc 123 

<210> 2 
<211> 10 
<212> PRT 

<213> Homo sapiens 
<400> 2 

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu 

1 5 10 

<210> 3 
<211> 277 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Translation template 
<400> 3 

rgrgrgrarc rararurura rcruraruru rurarcrara rururarcra rarurgrgrc 60 

rurgrararg rararcrarg rarararcru rgrarurcru rcrurgrara rgrarargra 120 

rcrcrurgrc rurgrcrgru rarararcrg rurcrgrurg rararcrarg rcrurgrara 180 

rarcrarcra rararcrurg rgrararcra rgrcrurgrc rgrurararc rurcrururg 240 

rcrgrcruaa aaaaaaaaaa aaaaaaaaaa aaaaacc 277 



<210> 4 
<211> 34 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Random peptide 

<221> VARIANT 

<222> (1)...(27) 

<223> Xaa is any amino acid. 

<221> VARIANT 

<222> (1)...(34) 

<223> Xaa = Any Amino Acid 

<400> 4 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 

1 5 10 15 

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gln Leu Arg Asn Ser 
20 25 30 

Cys Ala 



<210> 5 
<211> 50 
<212> RNA 

<213> Tobacco Mosaic Virus 
<400> 5 

rgrgrgrarc rararurura rcruraruru rurarcrara rururarcra 50 

<210> 6 
<211> 20 
<212> RNA 

<213> Escherichia coli 
<400> 6 

rgrgrargrg rarcrgrara 20 

<210> 7 
<211> 34 
<212> PRT 

<213> Homo sapiens 
<400> 7 

Met Ala Glu Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Leu Arg Lys 

1 5 10 15 

Arg Arg Glu Gln Lys Leu Lys His Lys Leu Glu Gln Leu Arg Asn Ser 
20 25 30 

Cys Ala 



<210> 8 
<211> 29 
<212> DNA 



WO 00/47775 PCT/US00/02589 

<213> Artificial Sequence 

<220> 

<223> Translation template 
<400> 8 

aaaaaaaaaa aaaaaaaaaa aaaaaaacc 29 

<210> 9 
<211> 12 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Translation template 
<400> 9 

aaaaaaaaaa cc 12 

<210> 10 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Translation template 
<400> 10 

cgcggttttt attttttttt ttcc 24 

<210> 11 
<211> 55 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Translation template 
<400> 11 

rgrgrargrg rarcrgrara rarurgaaaa aaaaaaaaaa aaaaaaaaaa aaacc 55 

<210> 12 
<211> 55 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Translation template 
<400> 12 

rgrgrargrg rarcrgrara rcrurgaaaa aaaaaaaaaa aaaaaaaaaa aaacc 55 



<210> 13 
<211> 55 
<212> RNA 

<213> Artificial Sequence 

<220> 

<223> Translation template 
<400> 13 

rgrgrargrg rarcrgrara rarurgaaaa aaaaaaaaaa aaaaaaaaaa aaacc 55 

<210> 14 
<211> 49 
<212> RNA 

<213> Artificial Sequence 

<220> 

<223> Translation template 
<400> 14 

rgrgrargrg rarcrgrara rcrurgaaaa aaaaaaaaaa aaaaaaacc 49 

<210> 15 
<211> 46 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Translation template 



<400> 15 

rgrgrargrg rarcrgrara rcrurgaaaa aaaaaaaaaa aaaacc 46 

<210> 16 
<211> 43 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Translation template 
<400> 16 

rgrgrargrg rarcrgrara rcrurgaaaa aaaaaaaaaa acc 43 

<210> 17 
<211> 289 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Translation template 

<221> misc_feature 
<222> (1)...(289) 
<223> n = A,T,C or G 
<400> 17 

rgrgrgrarc rararurura rcruraruru rurarcrara rururarcra rarurgrnrn 60 

rsrnrnrsrn rnrsrnrnrs rnrnrsrnrn rsrnrnrsrn rnrsrnrnrs rnrnrsrnrn 120 

rsrnrnrsrn rnrsrnrnrs rnrnrsrnrn rsrnrnrsrn rnrsrnrnrs rnrnrsrnrn 180 

rsrnrnrsrn rnrsrnrnrs rnrnrsrnrn rsrnrnrsrc rargrcrurg rcrgrurara 240 

rcrurcruru rgrcrgrcru aaaaaaaaaa aaaaaaaaaa aaaaaaacc 289 



<210> 18 

<211> 64 

<212> DNA 

<213> Homo sapiens 



<400> 18 

gttcaggtct tcttgagaga tcagtttctg ttccatttcg tcctccctat agtgagtcgt 60 
atta 64 



<210> 19 

<211> 18 

<212> DNA 

<213> Homo sapiens 



<400> 19 

taatacgact cactatag 18 



<210> 20 
<211> 12 
<212> PRT 

<213> Homo sapiens 



<400> 20 

Met Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn 
1 5 10 



<210> 21 

<211> 99 

<212> DNA 

<213> Homo sapiens 



<400> 21 

agcgcaagag ttacgcagct gttccagttt gtgtttcagc tgttcacgac gtttacgcag 60 
caggtcttct tcagagatca gtttctgttc ttcagccat 99 

<210> 22 
<211> 21 
<212> DNA 

<213> Homo sapiens 
<400> 22 

agcgcaagag ttacgcagct g 21 

<210> 23 
<211> 63 
<212> DNA 

<213> Homo sapiens 
taatacgact cactataggg acaattacta tttacaatta caatggctga agaacagaaa 60 

ctg 63 

<210> 24 
<211> 33 
<212> PRT 

<213> Homo sapiens 
<400> 24 

Met Ala Glu Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Leu Arg Lys 

1 5 10 15 

Arg Arg Glu Gln Leu Lys His Lys Leu Glu Gln Leu Arg Asn Ser Cys 
20 25 30 

Ala 



<210> 25 
<211> 49 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primers for RNA pool 
<400> 25 

ccctgttaat gataaatgtt aatgttacgt cgacgcattg agataccga 49 

<210> 26 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primers for RNA pool 
<400> 26 

taatacgact cactataggg acaattacta tttacaatta ca 42 

<210> 27 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primers for RNA pool 



<400> 27 

agcgcaagag ttacgcagct g 21 

<210> 28 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<223> DNA splint 
<400> 28 

tttttttttt agcgcaaga 19 

<210> 29 

<211> 18 

<212> DNA 

<213> Homo sapiens 

<400> 29 

gtggtatttg tgagccag 18 

<210> 30 
<211> 40 
<212> DNA 
<213> Phage T7 

<400> 30 

taatacgact cactataggg acacttgctt ttgacacaac 40 

<210> 31 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> DNA splint 
<400> 31 

tttttttttt gtggtatttg 20 

<210> 32 

<211> 248 

<212> RNA 

<213> Homo sapiens 

<400> 32 

rgrgrgrarc rararurura rcruraruru rurarcrara rururarcra rarurgrgrc 60 

rurgrararg rararcrarg rarararcru rgrarurcru rcrurgrara rgrarargra 120 

rcrcrurgrc rurgrcrgru rarararcrg rurcrgrurg rararcrarg rcrurgrara 180 

rarcrarcra rararcrurg rgrararcra rgrcrurgrc rgrurararc rurcrururg 240 

rcrgrcru 248 

<210> 33 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> DNA splint 



<400> 33 

tttttttttt agcgcaaga 